MOLECULAR BIOLOGY OF 

THE CELL 

THIRD EDITION 



Bruce Alberts • Dennis Bray 
Julian Lewis • Martin Raff • Keith Roberts 
James D. Watson




Garland Publishing, Inc. 
New York & London 



GARLAND STAFF 

Text Editor: Miranda Robertson 
Managing Editor: Ruth Adams 
Illustrator: Nigel Orme 

Molecular Model Drawings: Kate Hesketh-Moore 
Director of Electronic Publishing: John M. Roblin
Computer Specialist: Chuck Bartelt 
Disk Preparation: Carol Winter 
Copy Editor: Shirley M. Cobert 
Production Editor: Douglas Goertzen 
Production Coordinator: Perry Bessas
Indexer: Maija Hinkle 

Bruce Alberts received his Ph.D. from Harvard University and is
currently President of the National Academy of Sciences and Professor
of Biochemistry and Biophysics at the University of California, San
Francisco. Dennis Bray received his Ph.D. from the Massachusetts
Institute of Technology and is currently a Medical Research Council
Fellow in the Department of Zoology, University of Cambridge.
Julian Lewis received his D.Phil. from the University of Oxford and is
currently a Senior Scientist in the Imperial Cancer Research Fund
Developmental Biology Unit, University of Oxford. Martin Raff received
his M.D. from McGill University and is currently a Professor in the MRC
Laboratory for Molecular Cell Biology and the Biology Department,
University College London. Keith Roberts received his Ph.D. from the
University of Cambridge and is currently Head of the Department of Cell
Biology, the John Innes Institute, Norwich. James D. Watson received his
Ph.D. from Indiana University and is currently Director of the Cold Spring
Harbor Laboratory. He is the author of Molecular Biology of the Gene and,
with Francis Crick and Maurice Wilkins, won the Nobel Prize in Medicine
and Physiology in 1962.






© 1983, 1989, 1994 by Bruce Alberts, Dennis Bray, Julian Lewis, 
Martin Raff, Keith Roberts, and James D. Watson.

All rights reserved. No part of this book covered by the copyright hereon 
may be reproduced or used in any form or by any means-graphic, 
electronic, or mechanical, including photocopying, recording, taping, or 
information storage and retrieval systems-without permission of the 
publisher. 



Library of Congress Cataloging-in-Publication Data 
Molecular biology of the cell / Bruce Alberts ... [et al.]. 3rd ed.
p. cm. 

Includes bibliographical references and index. 

ISBN 0-8153-1619-4 (hard cover).-ISBN 0-8153-1620-8 (pbk.) 

1. Cytology. 2. Molecular biology. I. Alberts, Bruce. 

[DNLM: 1. Cells. 2. Molecular Biology. QH 581.2 M718 1994] 
QH581.2.M64 1994 
574.87— dc20 

DNLM/DLC 93-45907 
for Library of Congress cip 

Published by Garland Publishing, Inc. 
717 Fifth Avenue, New York, NY 10022 

Printed in the United States of America 
15 14 13 12 10 9 8 7 6 5 4 3 2 



Front cover: The photograph shows a rat nerve cell
in culture. It is labeled (yellow) with a fluorescent
antibody that stains its cell body and dendritic
processes. Nerve terminals (green) from other
neurons (not visible), which have made synapses on
the cell, are labeled with a different antibody.
(Courtesy of Olaf Mundigl and Pietro de Camilli.)
Dedication page: Gavin Borden, late president
of Garland Publishing, weathered in during his
mid-1980s climb near Mount McKinley with
MBoC author Bruce Alberts and famous mountaineer
guide Mugs Stump (1940-1992).
Back cover: The authors, in alphabetical order, 
crossing Abbey Road in London on their way to lunch. 
Much of this third edition was written in a house just 
around the corner. (Photograph by Richard Olivier.) 



If these minor cell proteins differ among cells to the same extent as the
more abundant proteins, as is commonly assumed, only a small number of protein
differences (perhaps several hundred) suffice to create very large differences
in cell morphology and behavior.

A Cell Can Change the Expression of Its Genes 
in Response to External Signals 3

Most of the specialized cells in a multicellular organism are capable of altering
their patterns of gene expression in response to extracellular cues. If a liver cell
is exposed to a glucocorticoid hormone, for example, the production of several
specific proteins is dramatically increased. Glucocorticoids are released during
periods of starvation or intense exercise and signal the liver to increase the
production of glucose from amino acids and other small molecules; the set of
proteins whose production is induced includes enzymes such as tyrosine
aminotransferase, which helps to convert tyrosine to glucose. When the hormone is no
longer present, the production of these proteins drops to its normal level.

Other cell types respond to glucocorticoids in different ways. In fat cells, for
example, the production of tyrosine aminotransferase is reduced, while some
other cell types do not respond to glucocorticoids at all. These examples illustrate
a general feature of cell specialization: different cell types often respond in
different ways to the same extracellular signal. Underlying this specialization are
features that do not change, which give each cell type its permanently
distinctive character. These features reflect the persistent expression of different
sets of genes.
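
The cell-type dependence described above can be pictured as a lookup from a (cell type, signal) pair to a response. The following Python sketch is illustrative only: the table holds just the two responses named in the text, and the function name and the default of "unchanged" for any unlisted cell type are assumptions made for this example.

```python
# Hypothetical lookup table: how production of tyrosine aminotransferase
# changes when a given cell type is exposed to a given signal.
# Only the two cases named in the text are listed.
TAT_RESPONSE = {
    ("liver cell", "glucocorticoid"): "increased",
    ("fat cell", "glucocorticoid"): "reduced",
}


def tat_production(cell_type, signal):
    """Return the change in tyrosine aminotransferase production."""
    # Assumption: any pair not in the table is a non-responding cell type.
    return TAT_RESPONSE.get((cell_type, signal), "unchanged")


for cell in ("liver cell", "fat cell", "other cell"):
    print(cell, "->", tat_production(cell, "glucocorticoid"))
```

The same signal maps to different outcomes because the key includes the cell type, which stands in for the stable pattern of gene expression that distinguishes one cell type from another.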



Gene Expression Can Be Regulated at Many of the Steps
in the Pathway from DNA to RNA to Protein 4

If differences between the various cell types of an organism depend on the
particular genes that the cells express, at what level is the control of gene expression
exercised? There are many steps in the pathway leading from DNA to protein, and
all of them can in principle be regulated. Thus a cell can control the proteins it
makes by (1) controlling when and how often a given gene is transcribed
(transcriptional control), (2) controlling how the primary RNA transcript is spliced or
otherwise processed (RNA processing control), (3) selecting which completed
mRNAs in the cell nucleus are exported to the cytoplasm (RNA transport
control), (4) selecting which mRNAs in the cytoplasm are translated by ribosomes
(translational control), (5) selectively destabilizing certain mRNA molecules in
the cytoplasm (mRNA degradation control), or (6) selectively activating,
inactivating, or compartmentalizing specific protein molecules after they have been
made (protein activity control) (Figure 9-2).
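
The six control points can be laid out as a small data structure. The Python sketch below is a simplification for illustration, not part of the original text: the names `Control`, `PATHWAY`, `ACTS_ON`, and `intermediates_already_made` are hypothetical, and each control is assumed to act on a single molecule in a linear pathway. For each control it lists the molecules the cell has already synthesized by the time that control acts.

```python
from enum import Enum


class Control(Enum):
    TRANSCRIPTIONAL = 1
    RNA_PROCESSING = 2
    RNA_TRANSPORT = 3
    TRANSLATIONAL = 4
    MRNA_DEGRADATION = 5
    PROTEIN_ACTIVITY = 6


# Molecules in the order in which the cell makes them.
PATHWAY = ["gene", "primary RNA transcript", "nuclear mRNA",
           "cytoplasmic mRNA", "protein"]

# The molecule on which each control acts (an assumption of this sketch).
ACTS_ON = {
    Control.TRANSCRIPTIONAL: "gene",
    Control.RNA_PROCESSING: "primary RNA transcript",
    Control.RNA_TRANSPORT: "nuclear mRNA",
    Control.TRANSLATIONAL: "cytoplasmic mRNA",
    Control.MRNA_DEGRADATION: "cytoplasmic mRNA",
    Control.PROTEIN_ACTIVITY: "protein",
}


def intermediates_already_made(control):
    """List what has been synthesized by the time this control acts."""
    stage = PATHWAY.index(ACTS_ON[control])
    # Skip the gene itself, which is not synthesized during expression.
    return PATHWAY[1:stage + 1]


for control in Control:
    print(control.value, control.name, intermediates_already_made(control))
```

In this sketch the list is empty only for transcriptional control; every later control point acts after at least one intermediate has been made.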

For most genes transcriptional controls are paramount. This makes sense
because, of all the possible control points illustrated in Figure 9-2, only
transcriptional control ensures that no superfluous intermediates are synthesized. In the







Figure 9-2 Six steps at which 
eucaryote gene expression can be 
controlled. Only controls that operate 
at steps 1 through 5 are discussed in 
this chapter. The regulation of protein
activity (step 6) is discussed in 
Chapter 5; this includes reversible 
activation or inactivation by protein 
phosphorylation as well as 
irreversible inactivation by proteolytic 
degradation. 



Overview of Gene Control



following sections we discuss the DNA and protein components that regulate the 
initiation of gene transcription. We return at the end of the chapter to the other
ways of regulating gene expression. 

Summary 

The genome of a cell contains in its DNA sequence the information to make many
thousands of different protein and RNA molecules. A cell typically expresses only a
fraction of its genes, and the different types of cells in multicellular organisms arise
because different sets of genes are expressed. Moreover, cells can change the pattern
of genes they express in response to changes in their environment, such as signals from
other cells. Although all of the steps involved in expressing a gene can in principle be
regulated, for most genes the initiation of RNA transcription is the most important
point of control.



DNA-binding Motifs in Gene
Regulatory Proteins 5

How does a cell determine which of its thousands of genes to transcribe? As
discussed in Chapter 8, the transcription of each gene is controlled by a regulatory
region of DNA near the site where transcription begins. Some regulatory regions
are simple and act as switches that are thrown by a single signal. Other
regulatory regions are complex and act as tiny microprocessors, responding to a
variety of signals that they interpret and integrate to switch the neighboring gene on
or off. Whether complex or simple, these switching devices consist of two
fundamental types of components: (1) short stretches of DNA of defined sequence
and (2) gene regulatory proteins that recognize and bind to them.

We begin our discussion of gene regulatory proteins by describing how these
proteins were discovered. 

Gene Regulatory Proteins Were Discovered Using 
Bacterial Genetics 6 

Genetic analyses in bacteria carried out in the 1950s provided the first evidence
of the existence of gene regulatory proteins that turn specific sets of genes on
or off. One of these regulators, the lambda repressor, is encoded by a bacterial
virus, bacteriophage lambda. The repressor shuts off the viral genes that code for
the protein components of new virus particles and thereby enables the viral
genome to remain a silent passenger in the bacterial chromosome, multiplying with
the bacterium when conditions are favorable for bacterial growth (see Figure
6-80). The lambda repressor was among the first gene regulatory proteins to be
characterized, and it remains one of the best understood, as we discuss later.
Other bacterial regulators respond to nutritional conditions by shutting off genes
encoding specific sets of metabolic enzymes when they are not needed. The lac
repressor, for example, the first of these bacterial proteins to be recognized, turns
off the production of the proteins responsible for lactose metabolism when this
sugar is absent from the medium.

The first step toward understanding gene regulation was the isolation of
mutant strains of bacteria and bacteriophage lambda that were unable to shut
off specific sets of genes. It was proposed at the time, and later proved, that most
of these mutants were deficient in proteins acting as specific repressors for these
sets of genes. Because these proteins, like most gene regulatory proteins, are
present in small quantities, it was difficult and time-consuming to isolate them.
They were eventually purified by fractionating cell extracts on a series of
standard chromatography columns (see pp. 166-169). Once isolated, the
proteins were shown to bind to specific DNA sequences close to the genes that they

404 Chapter 9 : Control of Gene Expression 



Figure 9-3 Double-helical structure
of DNA. The major and minor grooves
on the outside of the double helix are
indicated. The atoms are colored as
follows: carbon, dark blue; nitrogen,
light blue; hydrogen, white; oxygen,
red; phosphorus, yellow.






[Figure 9-43 graph: distributions of Giant, Krüppel, Hunchback, and Bicoid plotted against position along the embryo (anterior to posterior), with the position where eve stripe 2 forms marked.]



strung out over 20,000 nucleotide pairs of DNA and binds more than 20 different
proteins. A large and complex control region is thereby built from a series of
smaller modules, each of which consists of a unique arrangement of short DNA
sequences recognized by specific gene regulatory proteins. An important
requirement of this strategy is the absence of cross-talk between the modules: the state
of one module should not affect that of the others. How this molecular
insulation is achieved is unknown. In this way, however, a single gene can respond to
an enormous number of combinatorial inputs.

Complex Mammalian Gene Control Regions Are Also
Constructed from Simple Regulatory Modules 33

It has been estimated that several percent of the coding capacity of a mammalian
genome is devoted to the synthesis of proteins that serve as regulators of gene
transcription. This reflects the exceedingly complex controls that regulate the
expression of mammalian genes. It is not unusual, for example, to find a gene with
a control region that is 50,000 nucleotide pairs in length, in which many
modules, each containing a number of regulatory sequences that bind gene
regulatory proteins, are interspersed with long stretches of spacer DNA.
One of the best-understood examples of a complex mammalian regulatory
region is found in the human β-globin gene, which is expressed exclusively in red
blood cells and at a specific time in their development. A complex array of gene
regulatory proteins controls the expression of the gene, some acting as activators
and others as repressors (Figure 9-45). The concentrations (or activities) of many
of these gene regulatory proteins are thought to change during development, and



Figure 9-43 Distribution of the gene
regulatory proteins responsible for
ensuring that eve is expressed in
stripe 2. The distributions of these
proteins were visualized by staining a
developing Drosophila embryo with
antibodies directed against each of
the four proteins (Figure 9-39). The
expression of eve in stripe 2 occurs
only at the position where the two
activators (Bicoid and Hunchback)
are present and the two repressors
(Giant and Krüppel) are absent. In fly
embryos that lack Krüppel, for
example, stripe 2 expands posteriorly.
Likewise, stripe 2 expands posteriorly
if the DNA-binding sites for Krüppel
in the stripe 2 module (see Figure
9-41) are inactivated by mutation and
this regulatory region is reintroduced
into the genome.

The eve gene itself encodes a 
gene regulatory protein, which, after 
its pattern of expression is set up in 
seven stripes, in turn regulates the 
expression of other Drosophila genes. 
As development proceeds, the embryo 
is thus subdivided into finer and finer 
regions that eventually give rise to the 
different body parts of the adult fly, as 
discussed in Chapter 21. 
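
The rule stated in this caption (stripe 2 forms only where both activators are present and both repressors are absent) is a Boolean condition, and it can be sketched in Python. The domains below are hypothetical numbers chosen only to make the logic visible; they are not measured protein distributions, and real concentrations vary continuously rather than switching on and off at a boundary.

```python
# Hypothetical presence/absence domains along the anterior-posterior axis.
# Position x is percent of embryo length (0 = anterior, 100 = posterior).
# These boundaries are illustrative assumptions, not data from the figure.
DOMAINS = {
    "Bicoid": lambda x: x <= 45,
    "Hunchback": lambda x: x <= 50,
    "Giant": lambda x: x <= 30 or x >= 70,
    "Kruppel": lambda x: 40 <= x <= 60,
}
ACTIVATORS = ("Bicoid", "Hunchback")
REPRESSORS = ("Giant", "Kruppel")


def stripe2_on(x, missing=()):
    """True if eve stripe 2 is expressed at position x.

    `missing` names proteins absent from the embryo (for example, a mutant).
    """
    def present(name):
        return name not in missing and DOMAINS[name](x)

    all_activators = all(present(a) for a in ACTIVATORS)
    any_repressor = any(present(r) for r in REPRESSORS)
    return all_activators and not any_repressor


def stripe2_positions(missing=()):
    return [x for x in range(101) if stripe2_on(x, missing)]


wild_type = stripe2_positions()
no_kruppel = stripe2_positions(missing=("Kruppel",))
print(wild_type[0], wild_type[-1])    # 31 39
print(no_kruppel[0], no_kruppel[-1])  # 31 45
```

With these assumed domains the stripe keeps its anterior edge but extends posteriorly when Krüppel is removed, which is the qualitative behavior the caption describes for embryos that lack Krüppel.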



[Figure 9-44 labels: strongly activating assembly; strongly inhibiting protein; weakly activating protein assembly; silent assembly of regulatory proteins.]

Figure 9-44 Integration at a 
promoter. Multiple sets of gene 
regulatory proteins can work together 
to influence a promoter, as they do in 
the eve stripe 2 module illustrated 
previously in Figure 9-42. It is not yet 
understood in detail how the 
integration of multiple inputs is 
achieved. 



How Genetic Switches Work




only a particular combination of all the proteins triggers transcription of the gene. 
We see later that the β-globin gene is also subject to a second, higher layer of
control that involves global changes in chromatin structure. 

The Activity of a Gene Regulatory Protein 
Can Itself Be Regulated 34 

The strategies for regulating the eve gene and the human β-globin gene are similar
in that the gene control regions respond to a bewildering array of gene regula- 
tory proteins. Drosophila is unusual, however, in the way that the spatial distri- 
bution of gene regulatory proteins in the cytoplasm controls gene expression. 
As discussed previously, the early Drosophila embryo is a single giant cell that 
contains thousands of nuclei in a common cytoplasm, and the gene regulatory 
proteins themselves are distributed in complex spatial patterns so that different 
nuclei are exposed to different concentrations of the proteins. These gene regu- 
latory proteins enter the nuclei directly to activate or repress transcription of their 
target genes. In most embryos of other organisms, individual nuclei are in sepa- 
rate cells, and extracellular positional information must either pass across the 
plasma membrane or, more usually, generate signals in the cytosol in order to 
influence the genome. 

The mechanisms by which extracellular signals communicate their message 
across the plasma membrane to gene regulatory proteins inside the cell are dis- 
cussed in Chapter 15. Here we need deal only with the final steps in the intra- 
cellular signaling cascades activated by extracellular signals — the steps in which 
the activity of gene regulatory proteins is altered. In many cases the gene regu- 
latory protein is present in the cell in an inactive form and a signal alters the 
protein so as to activate it. The protein may be activated by phosphorylation 
catalyzed by a protein kinase, for example, or it may be released from a tight 
complex with a second protein that otherwise holds the gene regulatory protein 
in the cytosol, preventing it from entering the nucleus. These and some other 
ways of controlling the activity of gene regulatory proteins are illustrated in Figure 
9-46. 



Figure 9-45 Model for the control
of the human β-globin gene. The
diagram shows some of the gene
regulatory proteins thought to control
expression of the gene during red
blood cell development (see Figure
9-52). Some of the gene regulatory
proteins shown, such as CP1, are
found in many types of cells, while
others, such as GATA-1, are present
in only a few types of cells including
red blood cells and therefore are
thought to contribute to the cell-type
specificity of β-globin gene
expression. As indicated by the
double-headed arrows, several of the
binding sites for GATA-1 overlap
those of other gene regulatory
proteins; it is thought that occupancy
of these sites by GATA-1 excludes
binding of other proteins. (Adapted
from B. Emerson, In Gene Expression:
General and Cell-Type Specific
(M. Karin, ed.), pp. 116-161. Boston:
Birkhauser, 1993.)



Bacteria Use Interchangeable RNA Polymerase Subunits
to Help Regulate Gene Transcription 35 

We have seen the importance of gene regulatory proteins that bind to regulatory 
sequences in DNA and signal to the transcription apparatus whether or not to 
start the synthesis of an RNA chain. Although this is the main way of controlling 
transcriptional initiation in both eucaryotes and procaryotes, some bacteria and 






[Figure 9-46 labels: inactive and active forms; protein synthesis; protein phosphorylation; addition of second subunit (DNA-binding subunit, activation subunit); unmasking (inhibitor); stimulation of nuclear entry (inhibitory protein, nucleus).]



their viruses use an additional strategy based on interchangeable subunits of RNA
polymerase. As described in Chapter 8, a sigma (σ) subunit is required for the
bacterial RNA polymerase to recognize a promoter. Some bacteria make several
different sigma subunits, each of which can interact with the RNA polymerase
core and direct it to different specific promoters. This scheme permits one large
set of genes to be turned off and a new set to be turned on simply by replacing
one sigma subunit with another. The strategy is efficient because it bypasses the
need to deal with the genes one by one, and it is often used by bacterial viruses
to activate several sets of genes rapidly and sequentially (Figure 9-47).

In a sense, eucaryotes employ an analogous strategy through the use of three
distinct RNA polymerases (I, II, and III) that share some of their subunits.
Procaryotes, in contrast, use only one type of core RNA polymerase molecule but
modify it with different sigma subunits.

Gene Switches Have Gradually Evolved

We have seen that the control regions of eucaryotic genes are often spread out
over long stretches of DNA, whereas those of procaryotic genes are typically
closely packed around the start point of transcription. Several bacterial gene
regulatory proteins, however, recognize DNA sequences that are located many
nucleotide pairs away from the promoter. The example of DNA looping in E. coli shown
previously in Figure 9-32 resembles the way that eucaryotic gene regulatory
proteins act at a distance. In fact, this case provided one of the first examples of DNA
looping in gene regulation and greatly influenced later studies of eucaryotic gene
regulatory proteins.



Figure 9-46 Some ways in which the
activity of gene regulatory proteins is
regulated in eucaryotic cells. (A) The
protein is synthesized only when
needed and is rapidly degraded by
proteolysis so that it does not
accumulate. (B) Activation by ligand
binding. (C) Activation by
phosphorylation. (D) Formation of a
complex between a DNA-binding
protein and a separate protein with a
transcription-activating domain. (E)
Unmasking of an activation domain
by the phosphorylation of an
inhibitor protein. (F) Stimulation of
nuclear entry by removal of an
inhibitory protein that otherwise
keeps the regulatory protein from
entering the nucleus.



[Figure 9-47 labels: RNA polymerase with bacterial sigma factor; RNA polymerase with viral sigmalike factor; viral DNA; early genes; middle genes; late genes.]



Figure 9-47 Interchangeable RNA polymerase subunits as a strategy to control gene expression in a bacterial virus. 

The bacterial virus SPO1, which infects the bacterium B. subtilis, uses the bacterial polymerase to transcribe its early
genes. One of the early genes, called 28, encodes a sigmalike factor that binds to RNA polymerase and displaces the
bacterial sigma factor. This new form of polymerase specifically initiates transcription of the SPO1 "middle" genes.
One of the middle genes encodes a second sigmalike factor that displaces the 28 product and directs RNA polymerase 
to transcribe the "late" genes. This last set of genes produces the proteins that package the virus chromosome into a 
virus coat and lyse the cell. Thus, by this strategy, sets of virus genes are expressed in a particular order, allowing for 
rapid, yet temporally controlled, viral replication. 
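
The cascade in this caption can be written out as a short chain of lookups. The Python sketch below is a schematic under stated assumptions: the dictionary and function names are hypothetical, the factor labels are descriptive strings rather than gene names (apart from gene 28, which the caption names), and each gene class is reduced to the single sigmalike factor it contributes to the next step.

```python
# Which class of promoters each sigma (or sigmalike) factor directs
# RNA polymerase to. Labels are descriptive, not official gene names.
SIGMA_TO_GENE_CLASS = {
    "bacterial sigma factor": "early",
    "gene 28 product": "middle",
    "second sigmalike factor": "late",
}

# The sigmalike factor encoded within each gene class; it displaces the
# factor currently bound to the polymerase. Late genes encode none.
ENCODED_FACTOR = {
    "early": "gene 28 product",
    "middle": "second sigmalike factor",
    "late": None,
}


def expression_order(initial_sigma="bacterial sigma factor"):
    """Follow the cascade and return gene classes in order of expression."""
    order = []
    sigma = initial_sigma
    while sigma is not None:
        gene_class = SIGMA_TO_GENE_CLASS[sigma]
        order.append(gene_class)
        # The newly made factor replaces the current one on the polymerase.
        sigma = ENCODED_FACTOR[gene_class]
    return order


print(expression_order())  # ['early', 'middle', 'late']
```

The order of expression is not stored anywhere in the sketch; it emerges because each gene class supplies the factor needed to read the next, which is the point of the strategy.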







It seems likely that the close-packed arrangement of bacterial genetic 
switches developed from more extended forms of switches in response to the 
evolutionary pressure on bacteria to maintain a small genome size. (The same 
argument has been used to explain the lack of introns in bacteria, as discussed 
in Chapter 8.) This compression comes at a price, however, as it is difficult to 
imagine how the compact switches could be easily altered to incorporate new 
levels of control. The extended form of eucaryotic control regions, in contrast, 
with discrete regulatory modules separated by long stretches of spacer DNA, 
would be expected to facilitate reshuffling of modules during evolution, both to 
create new regulatory circuits and to modify old ones. Unraveling the history of 
how gene control regions evolved presents a fascinating challenge, and many 
clues can be found in present-day DNA sequences (Figure 9-48). 

Summary 

The transcription of individual genes is switched on and off in cells by gene regula- 
tory proteins. In procaryotes these proteins usually bind to specific DNA sequences 
close to the RNA polymerase start site and, depending on the nature of the regulatory 
protein and the precise location of its binding site relative to the start site, either 
activate or repress transcription of the gene. The flexibility of the DNA helix, however, 
also allows proteins bound at distant sites to affect the RNA polymerase at the pro- 
moter by the looping out of the intervening DNA. Such action at a distance is ex- 
tremely common in eucaryotic cells, where gene regulatory proteins bound to se- 
quences thousands of nucleotide pairs from the promoter can control gene expression. 

Although procaryotic RNA polymerases can initiate transcription on their own, 
eucaryotic polymerases require the prior assembly of general transcription factors at 
the promoter. These factors assemble in a particular order, beginning with the
binding of TFIID to the TATA box, a DNA sequence found just upstream of most
eucaryotic RNA polymerase start sites. The ordered assembly of general transcription
factors provides several steps at which the initiation of transcription can be regulated,
and many eucaryotic gene regulatory proteins are thought to work by facilitating 
(positive control) or hindering (negative control) the assembly process. 

Whereas the transcription of a typical procaryotic gene is controlled by only one 
or two gene regulatory proteins, the regulation of higher eucaryotic genes is much 
more complex, commensurate with the larger genome size and the large variety of cell 
types. The control region of the Drosophila eve gene, for example, encompasses 20,000 
nucleotide pairs of DNA and has binding sites for over 20 gene regulatory proteins. 



Figure 9-48 A comparison of part of
the control region upstream from
the engrailed gene in two species of
Drosophila. A DNA sequence
comparison between Drosophila
melanogaster and Drosophila virilis is
shown, with regions of 90% sequence
conservation shown in red. One
example of an actual sequence match
is illustrated in detail at the top. The
conserved sequences presumably
mark the sites where important gene
regulatory proteins bind, whereas
loops indicate places where insertions
or deletions of nucleotides have
occurred since these two species
evolved from a common ancestor
about 60 million years ago. (Courtesy
of Judith A. Kassis and Patrick H.
O'Farrell.)



Some of these proteins are transcriptional activators, while others are transcriptional
repressors. These proteins bind to regulatory sequences organized in a series of
regulatory modules strung together along the DNA.



Chromatin Structure and the Control
of Gene Expression 36

As discussed in Chapter 8, the genomes of eucaryotes are highly compacted to
allow the very long DNA molecules to fit inside the cell and to be managed easily.
The first level of compaction is the wrapping of DNA around histones to form
nucleosomes. In a second level of compaction nucleosomes are packed into
30-nm filaments. Finally, an even higher order of packing (still poorly understood)
is observed in heterochromatin, which is confined to selected regions of the
genome that show an unusually condensed interphase structure.
How do gene regulatory proteins and the general transcription factors gain
access to DNA that is packed into these compact protein-DNA structures, and
how does the packing affect the control of gene expression? We see in this
section that two general principles have emerged from studies of chromatin
structure and its influence on gene expression. First, nucleosomes do not usually
present a serious obstruction to either gene regulatory proteins or RNA
polymerases. Enhancers can still function despite them, histones that block a
promoter can be displaced, and, once transcription has begun, Pol II can transcribe
through the nucleosomes without dislodging them. Even bacterial polymerases,
which do not encounter nucleosomes in vivo, can transcribe through them,
suggesting that the nucleosome is built to be traversed easily (Figure 9-49). The
second general principle is that some forms of higher-order DNA packaging render
the DNA inaccessible both to gene regulatory proteins and to the general
transcription factors. Higher-order DNA packaging thus plays a crucial part in the
control of gene expression in eucaryotes, serving to silence large sections of the
genome, in some cases reversibly, in other cases not.

Transcription Can Be Activated on DNA That Is Packaged
into Nucleosomes 37

In the previous section we described a simple model for how transcription of a
eucaryotic gene is activated by a gene regulatory protein (see Figure 9-36). How







Figure 9-49 A tentative model to 
account for the ability of RNA 
polymerases to transcribe through 
nucleosomes without causing their 
displacement. The polymerase first 
displaces an H2A-H2B dimer (see 
Figure 8-10), allowing the polymerase 
to enter the nucleosome. In the next 
step the polymerase pulls the DNA 
away from the H3-H4 dimer it next 
encounters and continues 
transcribing. The displaced H2A-H2B 
dimer is recaptured by the 
nucleosome, and the second, 
symmetrically disposed H2A-H2B 
dimer is now displaced, allowing the 
process to repeat and permitting the 
polymerase to exit from the 
nucleosome. (After K.E. van Holde et 
al., J. Biol. Chem. 267:2837-2840, 
1992.) 






must this model be modified to take account of the presence of nucleosomes? To 
begin with the first step, how do nucleosomes affect the binding of a gene acti- 
vator protein to its regulatory DNA sequence? In some cases the regulatory se- 
quences reside in short nucleosome-free regions, and so no problem arises. It is 
uncertain how such nucleosome-free regions are maintained, but, as discussed 
in Chapter 8, some stretches of DNA are too stiff to accommodate the tight fold- 
ing necessary for nucleosome formation. In other cases, however, the regulatory 
sequence is packaged into a nucleosome and yet at least some gene activator 
proteins can still recognize and bind to it. Once bound, the regulatory proteins 
appear to destabilize the nucleosome, which is then at least partially disas- 
sembled. Which types of gene regulatory proteins can achieve this feat and how 
they accomplish it remain unknown. 

The general transcription factors, in contrast, seem unable to assemble onto 
a promoter that is packaged into a nucleosome. In fact, such packaging may have 
evolved in part to ensure that leaky, or basal, transcription initiation (that is, 
initiation without a gene activator protein bound upstream) does not occur. The 
binding of a gene activator protein thousands of nucleotide pairs away from a 
nucleosome-packaged promoter, however, can apparently displace a nucleosome 
from a promoter and thereby allow the assembly of the general transcription 
factors. The displacement either could be due to a separate activity of the gene
activator protein or could be an indirect consequence of the activator contact- 
ing the general transcription factors to facilitate their assembly on the DNA (Fig- 
ure 9-50). 

Some Forms of Chromatin Silence Transcription 38 

Although transcription can occur on DNA that is packaged into nucleosomes, the 
DNA in some special forms of chromatin appears to be inaccessible to gene ac- 



[Figure 9-50 labels: binding site for activator; activator protein; TATA; activator directly disassembles nucleosomes; activator also facilitates the assembly of the general transcription factors; assembly of general transcription factors.]




Figure 9-50 One model to explain
the displacement of nucleosomes
during the initiation of transcription
in eucaryotes. A bound gene activator
protein possesses a separate activity
that directly removes nucleosomes
from the promoter, exposing the
promoter to the general transcription
factors and enabling them to
assemble. In a different model (not
shown) nucleosome displacement
occurs as an indirect consequence of
the activator protein promoting the
assembly of the general transcription
factors; the activator protein, for
example, could permit the general
factors to begin to assemble in the
presence of a nucleosome, and, once
partially assembled, these factors
would destabilize the nucleosome.
The process underlying nucleosome
displacement is poorly understood.
The histones may simply leave the
DNA or, according to an alternative
model, they may disassemble but
remain bound to DNA (see Figure
8-40).







[Figure 9-51 labels: telomere; ADE2 gene at normal location on chromosome; white colony of yeast cells; red colony of yeast cells with white sectors; ADE2 gene moved near telomere; white gene.]




iCiior proteins. These inactive forms of chromatin, including the especially 
;)hly condensed form called heterochromatin (discussed in Chapter 8), are as- 
;!*MKi to contain special proteins that make the DNA unusually inaccessible. 

An observation in the yeast S. cerevisiae illustrates how some types of 
chromatin can shut off gene transcription. The ADE2 gene, whose expression is 
particularly easy to monitor, is expressed when present at its normal 
chromosomal location. When this gene is experimentally relocated to the end of 
a chromosome, however, its transcription is turned off, even though the cell 
contains all of the proteins required to transcribe the gene. The DNA near the 
ends of yeast chromosomes (the telomeres) is packaged into an especially 
inaccessible form of chromatin, and it is this packaging that is thought to be 
responsible for maintaining the translocated ADE2 gene in an inactive state, a 
process called silencing. The 
silencing of genes located near chromosome ends extends for approximately 
10,000 nucleotide pairs and applies to many genes in addition to ADE2; the 
silencing seems to weaken gradually with distance from the telomere. The 
mechanism of silencing is not known, but it seems likely to involve a 
cooperative assembly of proteins on the DNA that, once established, is 
heritable following DNA replication (Figure 9-51A). 

The silencing of the ADE2 gene is an example of a position effect, in which 
the activity of a gene is dependent on its position in the genome. Position 
effects were first recognized in Drosophila (Figure 9-51B), but they have now 
been observed in a number of other organisms and are thought to reflect the 
different states of chromatin present at different locations in the genome and 
the tendency of these states to spread to encompass nearby genes. We revisit 
this topic later in this chapter when we analyze mechanisms of cell memory. 



Figure 9-51 Position effects on gene 
expression. (A) The yeast ADE2 gene 
at its normal chromosomal location is 
expressed in all cells. When moved 
near the end of a yeast chromosome, 
the gene is silenced in most but not 
all cells of the population. The 
absence of the ADE2 gene product 
results in a block in the adenine 
biosynthetic pathway, which leads to 
the accumulation of a red pigment. 
The founder cell for the sectored 
colony shown had its ADE2 gene shut 
off, so most of the colony is red. 
Normally, yeast colonies are white. 
The white sectors at the edges of the 
red colony are clones of cells where 
the ADE2 gene has spontaneously 
become active. The finding of such 
sectoring indicates that the active and 
inactive states of ADE2 expression are 
heritable when this gene is near the 
telomere, a topic discussed in more 
detail in the next section. 

(B) Position effects can also be 
observed for the Drosophila white 
gene. Wild-type flies with a normal 
white gene have red eyes. If the white 
gene is inactivated by mutation, the 
eyes become white (hence the name 
of the gene). In flies with a 
chromosomal inversion that moves 
the white gene near a 
heterochromatic region, the eyes are 
mottled, with red and white patches. 
The white patches represent cells 
where the white gene is silenced and 
red patches represent cells that 
express the white gene. The difference 
is thought to arise from variations in 
how far along the chromosome the 
heterochromatin spreads early in eye 
development. As in the case of the yeast 
ADE2 gene, once established, the 
state of white expression is heritable, 
producing patches of many cells that 
express white as well as patches of 
cells where white is silenced. (After 
L.L. Sandell and V.A. Zakian, Trends 
Cell Biol 2:10-14, 1992.) 



Chromatin Structure and the Control of Gene Expression 



435 




An Initial Decondensation Step May Be Required Before 
Mammalian Globin Genes Can Be Transcribed 39 

Another example of how highly condensed chromatin can prevent gene expression 
comes from studies of chick and human β-globin gene clusters. The five genes 
of the cluster, spread over 50,000 nucleotide pairs of DNA, are transcribed 
exclusively in erythroid cells (that is, cells of the red blood cell lineage). 
Moreover, each gene is turned on at a different stage of development (Figure 
9-52) and in different organs: the ε-globin gene is expressed in the embryonic 
yolk sac, γ in the yolk sac and the fetal liver, and δ and β primarily in the 
adult bone marrow. We previously described a series of gene regulatory 
proteins that are necessary to turn on the human β-globin gene at the 
appropriate time and place (see Figure 9-45), and each of the other globin 
genes has a similar set of regulatory proteins, many of which are shared among 
these genes. In addition to the individual regulation of each of the globin 
genes, however, the entire cluster appears to be subject to an on-off control 
that involves global changes in chromatin structure. 

Some of the first evidence for such changes came from studies of the sensi- 
tivity of the globin genes in isolated nuclei to digestion by the nuclease enzyme 
DNase I. In cells where the globin genes are not expressed, the DNA in these 
genes is resistant to DNase I, indicating that they are tightly packaged into 
chromatin. In erythroid cells, by contrast, the entire gene cluster is 
sensitive to DNase I, indicating that the chromatin has changed to make the 
DNA more accessible to the 
enzyme. The DNA is still folded into nucleosomes, but the higher-order packing 
of the chromatin has loosened. This change in DNA packing occurs even before 
the individual globin genes are transcribed, suggesting that the genes are regu- 
lated in two steps. In the first step the chromatin of the entire globin locus is 
decondensed, which is presumed to allow some of the gene regulatory proteins 
access to the DNA. In the second step the remaining gene regulatory proteins 
assemble on the DNA and direct the expression of individual genes (Figure 9-53). 

The extensive change in chromatin structure that occurs in the first step is 
thought to require a region of DNA (called the locus control region, or LCR) that 
lies far upstream from the gene clusters (see Figure 9-52). The importance of the 
LCR can be seen in patients with a certain type of thalassemia, a severe genetic 
form of anemia. In these patients the β-globin locus is found to have undergone 
deletions that remove all or part of the LCR, and although the β-globin gene and 
its nearby regulatory regions are intact, the gene remains silent in erythroid 
cells. Moreover, the β-globin gene in the erythroid cells remains DNase I 
resistant, indicating that it fails to undergo the normal chromatin 
decondensation step during erythroid cell development. 

Subsequent experiments in transgenic mice have confirmed the profound 
effects of the LCR on the expression of globin genes. When, for example, the 
human β-globin gene plus its local regulatory sequences (the region shown in 
Figure 9-45) is inserted into different positions in the mouse genome, it is ex- 
pressed at low levels that depend on the site of insertion. This behavior is typi- 
cal for mammalian genes, and it indicates that local position effects influence the 
expression of the gene (Table 9-2). When the LCR is included with the gene, 



Figure 9-52 The cluster of β-like globin genes in humans. (A) The large 
chromosomal region shown spans 100,000 nucleotide pairs and contains the five 
globin genes and a locus control region (discussed in the text). (B) Changes 
in the expression of the β-like globin genes at various stages of human 
development. Each of the globin chains encoded by these genes combines with an 
α-globin chain to form the hemoglobin in red blood cells. (A, after F. 
Grosveld, G.B. van Assendelft, D.R. Greaves, and G. Kollias, Cell 51:975-985, 
1987. © Cell Press.) 



[Figure 9-53 diagram labels: STAGE 1 ALTERS CHROMATIN STRUCTURE; STAGE 2 TURNS 
ON GENE; gene regulatory protein; RNA polymerase; RNA] 

Figure 9-53 The two stages postulated to be involved in some gene activations, 
such as that of the human globin gene cluster. In stage 1 the structure of a 
large local region of chromatin is modified to decondense it in preparation 
for transcription. In stage 2 gene regulatory proteins (represented by a 
single protein in this simplified figure) bind to specific sites on the 
altered chromatin to induce RNA synthesis (transcription). 



Table 9-2 The Expression of a Gene Transferred to a Mouse Generally Shows 
Chromosome Position Effects, with Different Levels of Gene Activity 
in Independently Derived Transgenic Animals 

                         Percent of Total mRNA in         Gene Copies 
                       Yolk Sac   Liver    Gut    Brain     per Cell 
Endogenous gene          20         5      0.1     0           2 
Transgenic animal 1       3.4       1      0.1     0           4 
Transgenic animal 2       4.8      30      1.3     0           4 
Transgenic animal 3       4.4      13      4.7     0           4 
Transgenic animal 4       0.4       0.4    0       0          12 

In these experiments, carried out with the mouse alpha-fetoprotein gene, the 
DNA fragment injected into the fertilized mouse egg included 14,000 nucleotide 
pairs of upstream (5'-flanking) sequence, where three enhancers that affect 
the expression of this gene are located. Hybridization was used to compare the 
level of mRNA produced by the injected gene to that normally produced by the 
endogenous mouse alpha-fetoprotein gene in the indicated fetal tissues. (Data 
from R.E. Hammer et al., Science 235:53-58, 1987.) 

however, β-globin is expressed at high levels in erythroid cells regardless of 
the site of insertion, indicating that the LCR can override these position 
effects. 

Although several proteins that specifically bind to the LCR have been 
identified, the mechanism that alters the chromatin structure of the entire 
β-globin locus is not known. Some ideas for how such changes may be brought 
about are discussed in the next section. 
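
The position effects summarized in Table 9-2 can be made concrete with a 
little arithmetic. The sketch below is only an illustration: the values are 
transcribed from the table, but dividing each mRNA level by the gene copy 
number and comparing it with the endogenous gene is an assumed normalization 
chosen here for clarity, not an analysis made in the text, and the function 
names are hypothetical. 

```python
# Values transcribed from Table 9-2: percent of total mRNA in each fetal
# tissue, and gene copies per cell.
TABLE_9_2 = {
    "endogenous gene":     {"yolk sac": 20.0, "liver": 5.0,  "gut": 0.1, "brain": 0.0, "copies": 2},
    "transgenic animal 1": {"yolk sac": 3.4,  "liver": 1.0,  "gut": 0.1, "brain": 0.0, "copies": 4},
    "transgenic animal 2": {"yolk sac": 4.8,  "liver": 30.0, "gut": 1.3, "brain": 0.0, "copies": 4},
    "transgenic animal 3": {"yolk sac": 4.4,  "liver": 13.0, "gut": 4.7, "brain": 0.0, "copies": 4},
    "transgenic animal 4": {"yolk sac": 0.4,  "liver": 0.4,  "gut": 0.0, "brain": 0.0, "copies": 12},
}


def per_copy(name, tissue):
    """mRNA level divided by gene copy number (an assumed normalization)."""
    row = TABLE_9_2[name]
    return row[tissue] / row["copies"]


def relative_to_endogenous(name, tissue):
    """Per-copy level as a multiple of the endogenous per-copy level.

    Returns None where the endogenous gene is silent, since no ratio exists.
    """
    reference = per_copy("endogenous gene", tissue)
    if reference == 0:
        return None
    return per_copy(name, tissue) / reference


if __name__ == "__main__":
    for name in TABLE_9_2:
        ratios = {t: relative_to_endogenous(name, t) for t in ("yolk sac", "liver", "gut")}
        print(name, ratios)
```

On this normalization the four transgenic lines differ widely from one another 
and from the endogenous gene, which is the variability the table is meant to 
show. 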

The Mechanisms That Form Active Chromatin 
Are Not Understood 

The hypothetical model for globin activation outlined in Figure 9-53 implies 
that eucaryotes may contain sequence-specific DNA-binding proteins that 
function to decondense the chromatin in a local chromosomal domain that 
extends for tens of thousands of DNA nucleotide pairs. Alternatively, the 
observed differences in the chromatin structure of active genes could be an 
automatic consequence of the assembly of transcription factors or RNA 
polymerase (or both) onto a promoter rather than being a prerequisite for 
these events. We saw earlier that the assembly of the general transcription 
factors at promoters appears to be accompanied by changes in nucleosome 
distribution at the assembly site; perhaps this local perturbation can spread 
for long distances by some unknown propagation mechanism. 

Whether or not eucaryotes turn out to have proteins that are specifically 
designed to decondense domains of chromatin, it is worth speculating on how 
proteins might accomplish this task. At present we can only guess at the 
mechanism. Three possibilities are outlined in Figure 9-54. These very 
different types of models indicate how far we are from understanding the 
transition from inactive to active chromatin. 

Superhelical Tension in DNA Allows Action at a Distance 40 

One of the three models outlined in Figure 9-54 invokes topological changes in 
a closed loop of DNA double helix that can lead to the formation of DNA 
supercoils, a conformation that DNA adopts in response to superhelical 
tension. DNA supercoiling is most readily studied in small circular DNA 
molecules, such as the chromosomes of some viruses and plasmids. The same 
considerations apply, however, to any region of DNA bracketed by two ends that 
are unable to rotate freely, as, for example, in a loop of chromatin that is 
tightly clamped at its base. 







[Figure 9-61 diagram labels: protein A is not made because it is normally 
required for its own transcription; transient signal turns on expression of 
protein A; the effect of the transient signal is remembered in all of the 
cell's descendants] 

Figure 9-61 Schematic diagram showing how a positive feedback loop can create 
cell memory. Protein A is a gene regulatory protein that activates its own 
transcription. All of the descendants of the original cell will therefore 
"remember" that the progenitor cell had experienced a transient signal that 
initiated the production of the protein. 



sor protein. In the absence of such interference, however, the lambda repressor 
both turns off production of the cro protein and turns on its own synthesis, and 
this positive feedback loop helps to maintain the prophage state. Positive feed- 
back loops are a feature of many cell memory circuits (Figure 9-61). 
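
The memory created by a positive feedback loop of the kind shown in Figure 
9-61 can be sketched as a one-bit state machine. This is a deliberately 
minimal illustration under stated assumptions: protein A is treated as simply 
present or absent, each generation inherits the state of its parent, and the 
function names are hypothetical. 

```python
def protein_a_present(had_a: bool, signal_on: bool) -> bool:
    """Protein A is made if the signal is on or if A is already present,
    because A activates its own transcription."""
    return had_a or signal_on


def follow_lineage(generations: int, signal_generation=None) -> list:
    """Track protein A down one line of descent.

    The transient signal, if any, acts only in `signal_generation`.
    """
    has_a = False
    history = []
    for generation in range(generations):
        has_a = protein_a_present(has_a, generation == signal_generation)
        history.append(has_a)
    return history


if __name__ == "__main__":
    print(follow_lineage(5))                       # no signal: A is never made
    print(follow_lineage(5, signal_generation=2))  # A persists after the signal
```

Once the signal has acted, every later generation keeps protein A even though 
the signal itself is gone, which is the sense in which the cell "remembers". 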

Bacteriophage lambda illustrates an important general principle: a sophis- 
ticated pattern of inherited behavior can be achieved with only a few gene regu- 
latory proteins that reciprocally affect one another's synthesis and activities. We 
know that variations of this simple strategy are used by eucaryotic cells to estab- 
lish and maintain heritable patterns of gene transcription. Several gene regula- 
tory proteins that are involved in establishing the Drosophila body plan (dis- 
cussed in Chapter 21), for example, stimulate their own transcription, thereby 
creating a positive feedback loop that promotes their continued synthesis; at the 
same time these proteins repress the transcription of genes encoding other im- 
portant gene regulatory proteins. 

Expression of a Critical Gene Regulatory Protein 
Can Trigger Expression of a Whole Battery 
of Downstream Genes 45 

In general, a combination of multiple gene regulatory proteins, rather than a 
single protein, determines where and when a gene is transcribed in eucaryotes. 
But even if control is combinatorial, a single gene regulatory protein can be de- 
cisive in switching a cell from one developmental pathway or state of differen- 
tiation to another. A striking example comes from experiments on muscle cell 
differentiation in vitro. 

A mammalian skeletal muscle cell is typically extremely large and contains 
many nuclei. It is formed by the fusion of many muscle precursor cells called 
myoblasts. The mature muscle cell is distinguished from other cells by a large 
number of characteristic proteins, including specific types of actin, myosin, tro- 
pomyosin, and troponin (all part of the contractile apparatus), creatine phospho- 
kinase (for the specialized metabolism of muscle cells), and acetylcholine recep- 
tors (to make the membrane sensitive to nerve stimulation). In proliferating 
myoblasts these muscle-specific proteins and their mRNAs are absent or are 
present in very low concentrations. As myoblasts begin to fuse with one another, 
the corresponding genes are all switched on coordinately as part of a general 
transformation of the pattern of gene expression. 

This entire program of muscle differentiation can be triggered in cultured 
skin fibroblasts and certain other cell types by introducing any one of a family 



Figure 9-62 The effect of expressing the MyoD protein in fibroblasts. As shown 
in this immunofluorescence micrograph, skin fibroblasts from a chick embryo 
have been converted to muscle cells by the experimentally induced expression 
of the myoD gene. The fibroblasts were grown in culture and transfected three 
days earlier with a recombinant DNA plasmid containing the myoD coding 
sequence. Although only a few percent of the fibroblasts take up the DNA and 
produce the MyoD protein, these have fused to form elongated myotubes, which 
are stained with an antibody that detects a muscle-specific protein. The 
muscle cells are intermixed with a layer of fibroblasts, which are barely 
visible in this micrograph. Control cultures transfected with another plasmid 
contain no muscle cells. (Courtesy of Stephen Tapscott and Harold Weintraub.) 




of helix-loop-helix proteins, the so-called myogenic proteins (MyoD, Myf5, or 
myogenin, for example), normally expressed only in muscle cells (Figure 9-62). 
Binding sites for these regulatory proteins can be detected in the regulatory 
DNA sequences adjacent to many muscle-specific genes. From studies in 
transgenic mice it seems likely that MyoD and Myf5 act by turning on myogenin: 
if the myogenin gene is eliminated by targeted gene disruption, muscle cells 
fail to differentiate. 

It is probable that the fibroblasts and other cell types that are converted to 
muscle cells by myogenic proteins have already accumulated a number of gene 
regulatory proteins that can cooperate with the myogenic proteins to switch on 
muscle-specific genes. In this view it is a specific combination of gene 
regulatory proteins, rather than a single protein, that determines muscle 
differentiation. This idea is consistent with the finding that some cell types 
fail to be converted to muscle by myogenin or its relatives; these cells 
presumably have not accumulated the other gene regulatory proteins required. 

As we see next, combinatorial gene control has important implications for both 
the evolution and the development of multicellular organisms. 

Combinatorial Gene Control Is the Norm in Eucaryotes 46 

We have already discussed how multiple gene regulatory proteins can act in 
combination to regulate the expression of an individual gene. But, as the 
example of the myogenic proteins shows, combinatorial gene control means more 
than this: not only does each gene have many gene regulatory proteins to 
control it, but each regulatory protein contributes to the control of many 
genes. Moreover, 



[Figure 9-63 diagram labels: embryonic cell; cell G, cell H, cell I, cell J, 
cell K, cell L, cell M, cell N] 



Figure 9-63 The importance of 
combinatorial gene control for 
development. A highly schematic 
scheme illustrating how combinations 
of a few gene regulatory proteins can 
generate many cell types during 
development. In this simple scheme a 
"decision" to make one of a pair of 
different gene regulatory proteins 
(shown as numbered circles) is made 
after each cell division. Sensing its 
relative position in the embryo, the 
daughter cell toward the left side of 
the embryo is always induced to 
synthesize the even-numbered 
protein of each pair, while the 
daughter cell toward the right side of 
the embryo is induced to synthesize 
the odd-numbered protein. The 
production of each gene regulatory 
protein is assumed to be self- 
perpetuating (thereby contributing to 
cell memory). Therefore, the cells in 
the enlarging clone contain an 
increasing number of regulatory 
proteins. Note that, in this purely 
hypothetical example, eight cell types 
(G through N) have been created with 
5 different gene regulatory proteins. 
With continuation of such a scheme, 
more than 10,000 cell types could 
have been specified by only 25 
different gene regulatory proteins. 
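
The three-division case described in the caption of Figure 9-63 can be 
enumerated directly. The sketch below is hypothetical in its details: it 
assumes, so as to match the caption's count of five proteins, that the first 
division induces only protein 1 (in the right-hand daughter), and that later 
divisions use the pairs (2, 3) and (4, 5). It reproduces only the 
three-division case drawn in the figure. 

```python
def cell_types(divisions):
    """Return the set of regulatory proteins in each cell after `divisions`
    rounds of division.

    At division d the left daughter gains the even-numbered protein 2*d and
    the right daughter gains the odd-numbered protein 2*d + 1. Protein 0 does
    not exist, so at the first division the left daughter gains nothing
    (an assumption made to match the caption's total of five proteins).
    """
    cells = [frozenset()]
    for d in range(divisions):
        even, odd = 2 * d, 2 * d + 1
        next_cells = []
        for proteins in cells:
            left = proteins | {even} if even > 0 else proteins
            right = proteins | {odd}
            next_cells.extend([left, right])
        cells = next_cells  # proteins are self-perpetuating, so none are lost
    return cells


if __name__ == "__main__":
    cells = cell_types(3)
    print(len(set(cells)), "distinct cell types")
    print(sorted(set().union(*cells)), "proteins used")
```

Because each protein, once made, is retained by all descendants, every cell in 
the final generation carries a different combination, even though only five 
proteins are involved. 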



The Molecular Genetic Mechanisms That Create Specialized Cell Types 



445 



although some gene regulatory proteins [...] are specific to a single cell 
type, more typically the production of a given gene regulatory protein is 
itself switched on in a variety of cell types, at several sites in the body, 
and at several times in development. This point is illustrated schematically 
in Figure 9-63, which shows how combinatorial gene control makes it possible 
to generate a great deal of biological complexity. 

With combinatorial control a given gene regulatory protein does not 
necessarily have a single, simply definable function as commander of a 
particular battery of genes or specifier of a particular cell type; it may 
serve many purposes that overlap with those of other gene regulatory proteins. 
These proteins can be likened to the words of a language: they are used with 
different meanings in a variety of contexts, and it is the well-chosen 
combination that conveys the information that specifies a gene regulatory 
event. 

A consequence of combinatorial control is that the effect of adding a new gene 
regulatory protein to a cell will depend on the cell's past history, since 
this history will determine which gene regulatory proteins are already 
present. Thus during development a cell can accumulate a series of gene 
regulatory proteins that need not initially alter gene expression. When the 
final member of the requisite combination of gene regulatory proteins is 
added, however, the regulatory message is completed, leading to changes in 
gene expression. Such a scheme, as we have seen, could help to explain how the 
addition of a single regulatory protein to a fibroblast can produce the 
dramatic transformation of the fibroblast into a muscle cell. It also can 
account for the difference, discussed in Chapter 21, between the process of 
cell determination, where a cell becomes committed to a particular 
developmental fate, and the process of cell differentiation, where a committed 
cell expresses its specialized character. It is an essential [...] protein has 
been made, [...]. 

Combinatorial control also has an important consequence for evolution. Because 
gene regulatory proteins [...] or to a particular target gene, a subtle change 
in one gene regulatory protein can affect the expression pattern of many genes 
and cause a substantial change in cell behaviors. 

An Inactive X Chromosome Is Inherited 47 

We saw earlier how gene regulatory proteins [...] gene expression in both 
procaryotic and eucaryotic cells. [...] X chromosomes in female mammalian 
cells. 

The X and Y chromosomes are the sex chromosomes of mammals: female cells 
contain two X chromosomes, whereas male cells contain one X and one Y 
chromosome. Presumably because a double dose of X-chromosome products would be 
lethal, the female cells have evolved a mechanism for inactivating one of the 
two X chromosomes in each cell. [...] between the third and the sixth day of 
development, one or other of the two X chromosomes in each cell becomes highly 
condensed. The condensed chromosome is seen in the light microscope during 
interphase as a distinct structure known as a Barr body, located near the 
nuclear membrane, and it replicates late in S phase. Most [...] composed of 
clonal groups of cells in which only the paternally inherited X chromosome 
(Xp) is active and a roughly equal number of clonal groups of cells in which 
only the maternally inherited X chromosome (Xm) is active. In general, the 
cells expressing Xp and those expressing Xm are distributed in small clusters 
in the adult animal, reflecting the tendency of sister cells to remain close 
neighbors during the later stages of embryonic development and growth (Figure 
9-64). 

[Figure 9-64 diagram labels: cell in early embryo; DIRECT INHERITANCE OF THE 
PATTERN OF CHROMOSOME CONDENSATION; only Xm active in this clone; only Xp 
active in this clone] 
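
The mosaic pattern described above follows from two steps: a random choice in 
each early embryonic cell, then faithful inheritance of that choice. A minimal 
sketch, assuming an equal chance of keeping Xp or Xm active and perfect 
inheritance thereafter (the function names and the seed are hypothetical): 

```python
import random


def choose_active_x(n_cells, rng):
    """Each early embryonic cell independently keeps either Xp or Xm active."""
    return [rng.choice(("Xp", "Xm")) for _ in range(n_cells)]


def grow_clones(founders, divisions):
    """Every descendant inherits the active X of its founder cell."""
    return [[active_x] * (2 ** divisions) for active_x in founders]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    founders = choose_active_x(10, rng)
    clones = grow_clones(founders, divisions=4)
    for founder, clone in zip(founders, clones):
        print(founder, "founder ->", len(clone), "cells, all with", set(clone))
```

Each clone is uniform, but the animal as a whole is a patchwork of Xp-active 
and Xm-active clones. 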

The process that forms the condensed chromatin (the heterochromatin) in an X 
chromosome tends to spread continuously along the chromosome. This can be seen 
in studies with mutant animals in which one of the X chromosomes has become 
joined to a portion of an autosome (a nonsex chromosome). In such hybrid 
chromosomes regions of the autosome adjacent to an inactivated X chromosome 
are often condensed into heterochromatin, causing the genes they contain to be 
inactivated in a heritable way. This suggests that X-chromosome inactivation 
occurs by a cooperative process that can be thought of as a chromatin 
"crystallization" event that spreads linearly along the DNA from a nucleation 
site on the X chromosome. In fact, a unique inactivation center has been 
located genetically on the X chromosome: broken fragments of X chromosome do 
not undergo inactivation unless they include this center. 

Once the condensed chromatin structure is established on an X chromosome, some 
unknown process causes the structure to be faithfully inherited during all 
subsequent replications of the DNA. The change is not absolutely permanent, 
however, as the condensed X chromosome is reactivated in the formation of germ 
cells in the female. 



Figure 9-64 X inactivation. The 
clonal inheritance of a condensed 
inactive X chromosome that occurs in 
female mammals. 






[Figure 9-65 diagram labels: (A) barrier; heterochromatin; euchromatin; genes 
1 2 3 4 5; CHROMOSOME TRANSLOCATION. (B) early in the developing embryo, 
heterochromatin forms and spreads into neighboring euchromatin to different 
extents in different cells; cell proliferation; clone of cells with gene 1 
inactive; clone of cells with genes 1, 2, and 3 inactive; clone of cells with 
no genes inactivated] 


Drosophila and Yeast Genes Can Also Be Inactivated 
by Heritable Features of Chromatin Structure 48 

Earlier we discussed two examples of position effects on gene expression— one in 
Drosophila and one in yeast— that seem in many ways to be analogous to X-chro- 
mosome inactivation (see Figure 9-51). In both cases a specifically condensed 
form of chromatin prevents the expression of genes, and in both cases the con- 
densed state of chromatin is heritable. 

In flies with chromosomal rearrangements, breaking and rejoining events 
that place the middle of a region of heterochromatin next to a region of normal 
chromatin (euchromatin) tend to inactivate the nearby euchromatic genes. The 
situation is analogous to fusing a mammalian autosome to an inactive X chro- 
mosome, as just described, and the inactivation events are similarly patterned: 
the zone of inactivation spreads from the chromosome breakpoint to involve one 
or more adjacent genes. Moreover, while the extent of the spreading effect is 
different in different cells, the inactivated zone established in an embryonic cell 
is stably inherited by all of the cell's progeny (Figure 9-65). The example of po- 
sition effect in yeast described previously in Figure 9-51 also shares some of the 
features of X-chromosome inactivation, including the spreading effect and the 
heritability of the condensed chromatin state. 

It has not yet been proved that X-chromosome inactivation and the position 
effects in flies and yeast all occur by related mechanisms. Nevertheless, the par- 
allels are striking. The recent identification and cloning of several Drosophila and 
yeast genes required for the position effects have provided an experimental entry 
point for exploring the molecular mechanisms involved. Figure 9-66 shows one 
hypothetical scheme that could, in principle, account for both the spreading 
effect and the heritable nature of the condensed chromatin state. 

Regardless of its molecular basis, the packing of selected regions of the ge- 
nome into condensed chromatin is a type of genetic regulatory mechanism that 
is not available to bacteria. The crucial feature of this uniquely eucaryotic form 
of gene regulation is the storing of the stable memory of gene states in an inher- 
ited chromatin structure rather than in a stable feedback loop of self-regulating 
gene regulatory proteins that can diffuse from place to place in the nucleus. 
Whether mechanisms of this type operate only to inactivate large regions of chro- 
mosomes or whether they can also operate at the level of one or a few genes is 
not known. 



Figure 9-65 Position-effect variegation in Drosophila. (A) Heterochromatin 
(red) is normally prevented from spreading into adjacent regions of 
euchromatin (green) by special barrier sequences of unknown nature. In flies 
that inherit certain chromosomal translocations, however, this barrier is no 
longer present. (B) During the early development of such flies, the 
heterochromatin now spreads into neighboring chromosomal DNA, proceeding for 
different distances in different cells. The spreading soon stops, but the 
established pattern of heterochromatin is inherited, so that large clones of 
progeny cells are produced that have the same neighboring genes condensed into 
heterochromatin and thereby inactivated (hence the "variegated" appearance of 
some of these flies; see Figure 9-51B). This phenomenon shares many features 
with X-chromosome inactivation in mammals. 
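
The variegation described in Figure 9-65 can be sketched as a random spreading 
distance that is fixed in an embryonic cell and then copied to its progeny. 
This is an illustrative toy model only: it assumes five genes numbered by 
their distance from the breakpoint (as in the figure labels), a uniformly 
random spreading distance, and perfect inheritance; the function names are 
hypothetical. 

```python
import random

GENES = (1, 2, 3, 4, 5)  # gene 1 lies nearest the heterochromatin


def silenced(spread):
    """Genes inactivated when heterochromatin has spread over `spread` genes."""
    return tuple(g for g in GENES if g <= spread)


def founder_spreads(n_cells, rng):
    """Spreading stops at a different, randomly chosen point in each cell."""
    return [rng.randint(0, len(GENES)) for _ in range(n_cells)]


def clone(spread, n_progeny):
    """All progeny inherit the pattern established in the founder cell."""
    return [silenced(spread)] * n_progeny


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so the run is repeatable
    for spread in founder_spreads(4, rng):
        print("silenced genes in this clone:", clone(spread, 3)[0])
```

A spread of 0, 1, or 3 genes gives the three clones drawn in the figure: no 
genes inactivated, gene 1 inactive, and genes 1, 2, and 3 inactive. 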



The Pattern of DNA Methylation Can Be Inherited 
When Vertebrate Cells Divide 49 

The nucleotides in DNA can be covalently modified, and in vertebrate cells the 
methylation of cytosine seems to provide an important mechanism for 
distinguishing genes that are active from those that are not. The covalently 
modified 5-methylcytosine (5-methyl C) has the same relation to cytosine that 
thymine has to uracil and likewise has no effect on base-pairing (Figure 
9-67). The methylation in vertebrate DNA is restricted to cytosine (C) 
nucleotides in the sequence CG, which is base-paired to exactly the same 
sequence (in opposite orientation) on the other strand of the DNA helix. 
Consequently, a simple mechanism permits the existing pattern of DNA 
methylation to be inherited directly by the daughter DNA strands. An enzyme 
called maintenance methylase acts preferentially on those CG sequences that 
are base-paired with a CG sequence that is already methylated. As a result, 
the pattern of DNA methylation on the parental DNA strand will act as a 
template for the methylation of the daughter DNA strand, causing this pattern 
to be inherited directly following DNA replication (Figure 9-68). 

[Figure 9-66 diagram labels: inactive gene; active gene; DNA REPLICATION; new 
protein added by cooperative binding; free protein; no protein binds; BOTH 
DAUGHTER GENES ARE INACTIVE; BOTH DAUGHTER GENES ARE ACTIVE] 
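
The templating role of the parental strand can be sketched in a few lines. 
This is a minimal illustration, not a model of the enzyme: each CG site is 
reduced to a pair of flags (top strand, bottom strand), the example sequence 
and the choice of which site is methylated are arbitrary, and all names are 
hypothetical. 

```python
def cg_sites(seq):
    """Positions of the C in every CG dinucleotide on the top strand."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate(methylation):
    """Semiconservative replication: each daughter duplex keeps one old strand
    (with its methyl groups) and gains one new, unmethylated strand."""
    keeps_top = {site: (top, False) for site, (top, bottom) in methylation.items()}
    keeps_bottom = {site: (False, bottom) for site, (top, bottom) in methylation.items()}
    return keeps_top, keeps_bottom


def maintenance_methylase(methylation):
    """Methylate a CG only where the paired CG is already methylated."""
    return {site: (top or bottom, top or bottom)
            for site, (top, bottom) in methylation.items()}


if __name__ == "__main__":
    seq = "ACGTATCGT"
    first, second = cg_sites(seq)
    parent = {first: (True, True), second: (False, False)}
    for daughter in replicate(parent):
        print("hemimethylated:", daughter,
              "-> restored:", maintenance_methylase(daughter))
```

The methylated site is restored on both daughters, while the unmethylated site 
stays unmethylated, because the enzyme in this sketch never acts on a site 
that lacks a methylated partner. 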

Bacteria produce enzymes that are useful for studying methylation in 
vertebrate cells. They use the methylation of either an A or a C at a specific 
site to protect themselves from the action of their own restriction nucleases. 
The restriction nuclease HpaII, for example, cuts the sequence CCGG but fails 
to cleave it if the central C is methylated. Thus the susceptibility of a DNA 
molecule to cleavage by HpaII can be used to detect whether CG sequences at 
specific DNA sites are methylated. The inheritance of methylation patterns can 
be studied in vertebrate cells in culture by first using bacterial methylating 
enzymes to introduce methyl groups on cytosines and then using bacterial 
restriction nucleases to follow the inheritance of these groups. The enzyme 
used to introduce 5-methyl C into specific CG sequences is the HpaII methylase 
that normally protects the bacterium against its own HpaII restriction 
nuclease. If this enzyme is used to methylate the central C in the sequence 
CCGG on a cloned DNA molecule that is introduced into cultured vertebrate 
cells, the maintenance methylase can be shown to work as expected: each 
individual methylated CG is generally retained through many cell divisions, 
whereas unmethylated CG sequences remain unmethylated. 
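
The logic of the HpaII assay can be sketched as a simple site scan. The 
example sequence, the representation of methylation as a set of string 
positions, and the function names are all hypothetical; only the rule that 
CCGG is cut unless its central C is methylated comes from the text. 

```python
def ccgg_sites(seq):
    """Start positions of every CCGG site in the sequence."""
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "CCGG"]


def cleavable_sites(seq, methylated_positions):
    """CCGG sites that HpaII can cut: those whose central C (the second C,
    at start + 1) is not methylated."""
    return [i for i in ccgg_sites(seq) if (i + 1) not in methylated_positions]


if __name__ == "__main__":
    seq = "AACCGGTTCCGGA"
    print("all CCGG sites:", ccgg_sites(seq))
    print("cut when unmethylated:", cleavable_sites(seq, set()))
    print("cut when position 3 is methylated:", cleavable_sites(seq, {3}))
```

A site that resists cleavage is thereby inferred to carry a methyl group on 
its central C, which is how the assay reads out methylation. 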

The maintenance methylase explains the automatic inheritance of 5-methyl C 
nucleotides, but since it normally does not methylate fully unmethylated DNA, 
it leaves unanswered the question of how the methyl group is first added in a 
vertebrate organism. If a fully unmethylated DNA molecule is injected into a 
fertilized mouse egg, methyl groups will be added to nearly every CG site (an 
important exception will be described below). This is presumed to reflect the 
presence of a novel establishment methylase activity in the egg. As we shall 
see, de novo methylation can also occur during the differentiation of 
specialized cell types, although it is not known how it occurs. 



Figure 9-67 Formation of 5-methyl-cytosine occurs by methylation of a 
cytosine base in the DNA double helix. In vertebrates this event is confined 
to selected cytosine (C) nucleotides located in the sequence CG. 



Figure 9-66 A general scheme that 
permits the direct inheritance of 
states of gene expression during DNA 
replication. In this hypothetical 
model, portions of a cooperatively 
bound cluster of chromosomal 
proteins are transferred directly from 
the parental DNA helix (top left) to 
both daughter helices. The inherited 
cluster then causes each of the 
daughter DNA helices to bind 
additional copies of the same 
proteins. Because the binding is 
cooperative, DNA synthesized from 
an identical parental DNA helix that 
lacks the bound proteins (top right) 
will remain free of them. If the bound 
proteins turn off gene transcription, 
then the inactive gene state will be 
directly inherited, as illustrated. If the 
cooperative protein binding requires 
specific DNA sequences, these events 
will be limited to specific gene control 
regions; if the binding can be 
propagated all along the chromo- 
some, however, it could account for 
the spreading effect associated with 
the heritable chromatin states 
discussed in the text. 



[Figure 9-67 diagram: cytosine is converted to 5-methylcytosine by methylation.]




Molecular Genetic Mechanisms That Create Specialized Cell Types






[Diagram: a DNA duplex (5'-ACGTATCGT-3' paired with 3'-TGCATAGCA-5') carrying
one methylated and one unmethylated cytosine in CG sequences; labeled steps:
DNA replication, followed by methylation.]



DNA Methylation Reinforces Developmental Decisions 
in Vertebrate Cells 50 

Although DNA methylation was once proposed to play a dominant part in gen- 
erating different mammalian cell types, it is now viewed as having a more subtle 
role. In some invertebrates, including Drosophila, DNA methylation does not 
occur, yet the control of gene expression and the diversification of cell types 
appear to be similar in Drosophila and vertebrates. Several observations are con- 
sistent with the idea that DNA methylation in vertebrates is associated with gene 
inactivation but that it usually only reinforces decisions that are first brought 
about by other mechanisms. Thus tests with the HpaII restriction nuclease
indicate that in general the DNA of inactive genes is more heavily methylated than
that of active genes. When an inactive gene that contains methylated DNA is 
turned on during the course of normal development, however, it generally loses 
most of its methyl groups only after the gene has been activated. Conversely, the 
female X chromosome, discussed above, is first condensed and inactivated and 
only later acquires an increased level of methylation on some of its genes. 

What, then, does methylation do, and why is it useful to the organism? There 
are at least two important clues. First, the DNA corresponding to a muscle-spe- 
cific actin gene can be prepared in both its fully methylated and its fully 
unmethylated form. When these two versions of the gene are introduced into 
cultured muscle cells, both are transcribed at the same high rate. When, however, 
they are introduced into fibroblasts, which normally do not transcribe the gene, 
the unmethylated gene is transcribed at a low rate, whereas neither the exog- 
enously added methylated gene nor the endogenous gene of the fibroblast (which 
is also methylated) is transcribed at all. Second, biochemical experiments have 
identified a vertebrate protein that binds tightly to DNA that contains clustered 
5-methyl C nucleotides. The binding of this protein is thought to package the 
methylated DNA in a way that makes it unusually resistant to the transcriptional 
activation machinery. These two observations suggest that DNA methylation is 
used in vertebrates mainly to ensure that once a gene is turned off, it stays com- 
pletely off (Figure 9-69). 

Experiments designed to test whether a DNA sequence that is transcribed at 
high levels in one vertebrate cell type is transcribed at all in another have dem- 
onstrated that rates of gene transcription can differ between two cell types by a 
factor of more than 10^6. Thus unexpressed vertebrate genes are much less "leaky"
in terms of transcription than are unexpressed genes in bacteria, in which the 



Figure 9-68 How DNA methylation
patterns are faithfully inherited. In
vertebrate DNAs a large fraction of
the cytosine nucleotides in the
sequence CG are methylated (see
Figure 9-67). Because of the existence
of a methyl-directed methylating
enzyme (the maintenance methylase),
once a pattern of DNA methylation is
established, each site of methylation
is inherited in the progeny DNA, as
shown. This means that changes in
DNA methylation patterns will be
perpetuated in all of the progeny of a
cell.



450 Chapter 9 : Control of Gene Expression 



[Figure 9-69 diagram labels: gene regulatory proteins; general transcription
factors; GENE ON; LOSS OF GENE REGULATORY PROTEINS; GENE OFF BUT LEAKY;
NEW DNA METHYLATION.]



Figure 9-69 How DNA methylation 
may help turn off genes. The binding 
of gene regulatory proteins and 
general transcription factors near an 
active promoter prevents DNA 
methylation by some unknown 
mechanism. If most of these 
sequence-specific DNA-binding 
proteins dissociate, however, as 
generally occurs when a gene is 
turned off, the DNA becomes 
methylated, which enables other 
proteins to bind, and these shut down 
the gene completely. 



[Figure 9-69 diagram labels, continued: BINDING OF PROTEINS THAT RECOGNIZE
METHYL C; gene off.]



largest known differences in transcription rates between expressed and
unexpressed gene states are about 1000-fold. DNA methylation of unexpressed
vertebrate genes may account for at least part of this difference. In addition,
as we discuss next, DNA methylation is required for at least one special type
of cellular memory.

Genomic Imprinting Requires DNA Methylation 51

Mammalian cells are diploid, containing one set of genes inherited from the
father and one set from the mother. In a few cases the expression of a gene has
been found to depend on whether it is inherited from the mother or the father.
This phenomenon is called genomic imprinting. Although not originally
discovered in this way, genomic imprinting has been dramatically illustrated in
experiments with transgenic mice. It is possible, for example, to make
transgenic mice in which one of the two normal copies of the gene coding for
insulinlike growth factor-2 (Igf-2) has been inactivated by mutation. These
heterozygous mice develop normally if it is the maternally derived Igf-2 gene
that is defective, but if the paternally derived Igf-2 gene is defective, they
are stunted, growing to about half the size of normal mice. Further analysis of
these and normal mice revealed an explanation. In both the transgenic and the
wild-type mice only the paternally derived Igf-2 gene is transcribed, while the
maternally derived gene is silent. The maternally derived gene in this case is
said to be imprinted.

Although the mechanism of imprinting is uncertain, it seems very likely that
DNA methylation is involved. Thus, in transgenic mice defective in the
maintenance methylase, the imprinting of the maternal Igf-2 gene does not
occur, implying that the mechanism that distinguishes between the paternal and
maternal copies of the Igf-2 gene requires DNA methylation. Interestingly, the
mice lacking the maintenance methylase die as young embryos. This could result
from defects in imprinting, but it is also conceivable that a failure to
reinforce developmental decisions by methylation is the primary defect, leading
to leaky transcription of the many thousands of genes that are normally turned
off in each cell type.



CG-rich Islands Are Associated with About
40,000 Genes in Mammals 52



Because of the way DNA repair enzymes work, methylated C nucleotides in the 
genome tend to be eliminated in the course of evolution. Accidental deamina- 
tion of an unmethylated C gives rise to U, which is not normally present in DNA 
and thus is recognized easily by the DNA repair enzyme uracil DNA glycosylase, 
excised, and then replaced with a C (discussed in Chapter 6). But accidental 
deamination of a 5-methyl C cannot be repaired in this way, for the deamination 
product is a T and so indistinguishable from the other, nonmutant T nucleotides 
in the DNA. Although a special repair system exists to remove these mutant Ts 
(see p. 250), many of the deaminations escape detection, so that those C nucle- 
otides in the genome that are methylated tend to mutate to T over evolutionary 
time. 

During the course of evolution, more than three out of every four CGs have 
been lost in this way, leaving vertebrates with a remarkable deficiency of this 
dinucleotide. The CG sequences that remain are very unevenly distributed in the 
genome; they are present at 10 to 20 times their average density in selected re- 
gions, called CG islands, that are 1000 to 2000 nucleotide pairs long. These is- 
lands, with some important exceptions, seem to remain unmethylated in all cell 
types. They are thought to surround the promoters of the so-called housekeep- 
ing genes — those genes that code for the many proteins that are essential for cell 
viability and are therefore expressed in most cells (Figure 9-70). In addition, many 
tissue-specific genes, which code for proteins needed only in selected types of 
cells, are also associated with CG islands. 
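
The idea of a CG island as a stretch in which the CG dinucleotide is many times more frequent than the genome average can be expressed as a simple scan. The sketch below is a minimal illustration, not a published island-finding algorithm: the function names, the fixed non-overlapping window of 1000 nucleotides, and the 10-fold threshold are assumptions chosen to mirror the figures quoted in the text.

```python
def cg_density(seq):
    """Number of CG dinucleotides per nucleotide of sequence."""
    return seq.upper().count("CG") / len(seq) if seq else 0.0


def candidate_cg_islands(seq, window=1000, fold=10.0):
    """Return (start, end) of windows whose CG density is at least
    `fold` times the average density of the whole sequence."""
    background = cg_density(seq)
    islands = []
    for start in range(0, len(seq) - window + 1, window):
        density = cg_density(seq[start:start + window])
        if background > 0 and density >= fold * background:
            islands.append((start, start + window))
    return islands


# Toy genome: a CG-poor background containing one CG-dense stretch.
sparse = ("AT" * 99 + "CG") * 5      # 1000 nucleotides, 5 CG dinucleotides
dense = "CG" * 500                   # 1000 nucleotides, 500 CG dinucleotides
genome = sparse * 20 + dense + sparse * 29
print(candidate_cg_islands(genome))  # [(20000, 21000)]
```

Note that the island itself raises the genome-wide average, so in a very short toy sequence a 10-fold threshold may never be reached; the example uses a 50,000-nucleotide sequence for that reason.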

The distribution of CG islands can be explained if we assume that CG me- 
thylation was adopted in vertebrates as a way of hindering the initiation of tran- 
scription in inactive segments of the genome (Figure 9-71). In the germ line of 
vertebrates — the cell lineage giving rise to eggs and sperm — most of the genome 
is inactive and methylated. Over long periods of evolutionary time, the methy- 
lated CG sequences in these inactive regions have presumably been lost through 
accidental deamination events that were not correctly repaired. The CG se- 
quences in the regions surrounding the promoters of many genes, however, in- 
cluding all housekeeping genes, are kept demethylated in cells of the germ line, 
and so they can be readily repaired after spontaneous deamination events. Such 
regions are preserved as CG islands. 
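
The evolutionary argument can be illustrated with a toy simulation. It is a sketch under strong simplifying assumptions, all of them introduced here for illustration: only one strand is followed, every CG outside a designated unmethylated region is treated as methylated, deamination of an unmethylated C is always repaired, and deamination of a 5-methyl C escapes repair with a fixed probability `p_escape` per generation. The function names and parameter values are hypothetical.

```python
import random


def deaminate(seq, unmethylated, p_escape, rng):
    """Apply one generation of deamination to the C of each CG."""
    bases = list(seq)
    for i in range(len(bases) - 1):
        if bases[i] == "C" and bases[i + 1] == "G":
            if i in unmethylated:
                continue            # C -> U is recognized and repaired to C
            if rng.random() < p_escape:
                bases[i] = "T"      # 5-methyl C -> T escapes repair
    return "".join(bases)


def evolve(seq, unmethylated, p_escape, generations, seed=0):
    """Repeat deamination for many generations with a seeded generator."""
    rng = random.Random(seed)
    for _ in range(generations):
        seq = deaminate(seq, unmethylated, p_escape, rng)
    return seq


ancestor = "ACGT" * 50      # 200 nucleotides, 50 CG dinucleotides
island = range(80, 120)     # positions kept unmethylated in the germ line
descendant = evolve(ancestor, island, p_escape=0.05, generations=100)
print(descendant[80:120].count("CG"), "CGs remain in the island;",
      descendant.count("CG"), "remain in total")
```

Whatever the random draws, the CG sequences inside the protected region survive unchanged, while those outside it are progressively converted to TG.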

The mammalian genome (about 3 × 10^9 nucleotide pairs) contains an estimated
40,000 CG islands. Most of the islands mark the 5' ends of a transcription
unit and thus, presumably, a gene. It is possible to clone specifically the DNA 
surrounding the CG islands, and this technique provides a convenient way of 
finding new genes. 
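
A quick calculation shows how small a part of the genome the islands occupy, which is why cloning the DNA around them is an efficient way to home in on genes. The sketch below simply combines the numbers quoted in the text (40,000 islands, each 1000 to 2000 nucleotide pairs long, in a genome of about 3 × 10^9 nucleotide pairs); the function name is invented for the example.

```python
def island_fraction(n_islands, island_length, genome_size):
    """Fraction of the genome covered by CG islands of a given length."""
    return n_islands * island_length / genome_size


low = island_fraction(40_000, 1_000, 3_000_000_000)
high = island_fraction(40_000, 2_000, 3_000_000_000)
print(f"{low:.1%} to {high:.1%} of the genome")  # 1.3% to 2.7% of the genome
```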



[Figure 9-70 diagram: the CG island, introns, and exons are marked on the DNA,
with the RNA transcript (5' to 3') shown below, for the dihydrofolate reductase
gene, the hypoxanthine phosphoribosyl transferase gene, and a ribosomal protein
gene; scale bar, 10,000 nucleotide pairs.]



Figure 9-70 The CG islands
surrounding the promoter in three
mammalian housekeeping genes.
The yellow boxes show the extent of
each island. Note also that, as for
most genes in mammals, the exons
(dark red) are very short relative to
the introns (light red). (Adapted from
A.P. Bird, Trends Genet. 3:342-347,
1987.)






[Figure 9-71 diagram: VERTEBRATE ANCESTOR DNA, with the RNA transcript (5' to
3') indicated; methylation of most CG sequences in germ line; many millions of
years of evolution; VERTEBRATE DNA, in which a CG island remains; scale bar,
1000 nucleotide pairs.]



Figure 9-71 A mechanism to explain 
both the marked deficiency of CG 
sequences and the presence of CG
islands in vertebrate genomes. A 

black line marks the location of an 
unmethylated CG dinucleotide in the 
DNA sequence, while a red line marks 
the location of a methylated CG 
dinucleotide. 



Summary 

The many types of cells in animals and plants are created largely through
mechanisms that cause different genes to be transcribed in different cells.
Since many specialized animal cells can maintain their unique character when
grown in culture, the gene regulatory mechanisms involved in creating them must
be stable once established and heritable when the cell divides, endowing the
cell with a memory of its developmental history. Procaryotes and yeasts provide
unusually accessible model systems in which to study gene regulatory
mechanisms, some of which may be relevant to the creation of specialized cell
types in higher eucaryotes. One such mechanism involves a competitive
interaction between two (or more) gene regulatory proteins, each of which
inhibits the synthesis of the other; this can create a flip-flop switch that
switches a cell between two alternative patterns of gene expression. Direct or
indirect positive feedback loops, which enable gene regulatory proteins to
perpetuate their own synthesis, provide a general mechanism for cell memory.

In eucaryotes gene transcription is generally controlled by combinations of
gene regulatory proteins. It is thought that each type of cell in a higher
eucaryotic organism contains a specific combination of gene regulatory proteins
that ensures the expression of only those genes appropriate to that type of
cell. A given gene regulatory protein may be expressed in a variety of
circumstances and typically is involved in the regulation of many genes.

In addition to diffusible gene regulatory proteins, inherited states of
chromatin condensation are also utilized by eucaryotic cells to regulate gene
expression. In vertebrates DNA methylation also plays a part, mainly as a
device to reinforce decisions about gene expression that are made initially by
other mechanisms.



Posttranscriptional Controls 

Although controls on the initiation of gene transcription are the predominant
form of regulation for most genes, other controls can act later in the pathway
from RNA to protein to modulate the amount of gene product that is made.
Although these posttranscriptional controls, which operate after RNA polymerase
has bound to the gene's promoter and begun RNA synthesis, are less common
than transcriptional control, for many genes they are crucial. It seems that
every step in gene expression that could be controlled in principle is likely
to be regulated under some circumstances for some genes.

We consider the varieties of posttranscriptional regulation in temporal order,
according to the sequence of events that might be experienced by an RNA
molecule after its transcription has begun (Figure 9-72).



[Figure 9-72 diagram: steps shown in order: START RNA TRANSCRIPTION; POSSIBLE
ATTENUATION; SPLICING AND 3'-END CLEAVAGE; NUCLEAR EXPORT; SPATIAL LOCALIZATION
IN CYTOPLASM; POSSIBLE RNA EDITING; START TRANSLATION; POSSIBLE TRANSLATIONAL
RECODING; POSSIBLE RNA STABILIZATION; CONTINUED PROTEIN SYNTHESIS. Alternative
outcomes shown: RNA transcript aborts; nonfunctional mRNA sequences; retention
in nucleus; translation blocked; RNA degraded.]



Figure 9-72 Possible post- 
transcriptional controls on gene 
expression. Only a few of these 
controls are likely to be used for any 
one gene. 






Transcription Attenuation Causes the Premature
Termination of Some RNA Molecules 53 

In bacteria the expression of certain genes is inhibited by premature termination 
of transcription, a phenomenon called transcription attenuation. In some of 
these cases the nascent RNA chain adopts a structure that causes it to interact 
with the RNA polymerase in such a way as to abort its transcription. When the 
gene product is required, regulatory proteins bind to the nascent RNA chain and 
interfere with attenuation, allowing the transcription of a complete RNA mol- 
ecule. 

In eucaryotes transcription attenuation can occur by a number of distinct 
mechanisms. In both adenovirus and HIV (the human AIDS virus), for example, 
the proteins that assemble at the promoter seem to determine whether or not the 
polymerase will be able to pass through specific sites of attenuation downstream. 
These proteins can differ from one cell type to the next, and the cell can control 
the degree of attenuation for particular genes. 

Alternative RNA Splicing Can Produce Different Forms 
of a Protein from the Same Gene 54 

As discussed in Chapter 8, many genes are first transcribed as long mRNA pre- 
cursors that are then shortened by a series of processing steps to produce the 
mature mRNA molecule. One of these steps is RNA splicing, in which the intron 
sequences are removed from the mRNA precursor. Often a cell can splice the 
primary transcript in different ways and thereby make different polypeptide 
chains from the same gene— a process called alternative RNA splicing (Figure 
9-73). A substantial proportion of higher eucaryotic genes produce multiple pro- 
teins in this way. When different splicing possibilities exist at several positions 
in the transcript, a single gene can produce dozens of different proteins. Usually, 
however, the splice alternatives are more limited, and only a few kinds of pro- 
teins are synthesized from each transcription unit. 
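
The way independent splicing choices multiply can be shown by enumerating them. The sketch below is a hedged illustration: the exon names and the representation of each position in the transcript as a tuple of alternatives (with `None` standing for an exon that is skipped) are assumptions made for the example, and real splice choices are often not independent of one another.

```python
from itertools import product


def splice_isoforms(positions):
    """Enumerate every mRNA that can be assembled from the alternatives.

    positions is a list of tuples; each tuple lists the alternative exons
    at one position of the transcript, with None meaning "exon skipped".
    """
    isoforms = []
    for combination in product(*positions):
        isoforms.append(tuple(exon for exon in combination if exon is not None))
    return isoforms


transcript = [
    ("E1",),           # constitutive exon
    ("E2", None),      # optional exon
    ("E3a", "E3b"),    # mutually exclusive exons
    ("E4",),           # constitutive exon
]
for mrna in splice_isoforms(transcript):
    print("-".join(mrna))

# Five independent two-way choices already give 2**5 = 32 isoforms.
print(len(splice_isoforms([("x", None)] * 5)))  # 32
```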

In some cases alternative RNA splicing occurs because there is an "intron 
sequence ambiguity": the standard spliceosome mechanism for removing intron 
sequences (discussed in Chapter 8) is unable to distinguish cleanly between two 
or more alternative pairings of 5' and 3' splice sites, so that different choices are 
made haphazardly on different occasions. Where such constitutive alternative 
splicing occurs, several versions of the protein encoded by the gene are made in 
all cells in which the gene is expressed. 



[Figure 9-73 diagram labels: optional exon; optional intron; mutually exclusive
exons; internal splice site.]

Figure 9-73 Four patterns of
alternative RNA splicing. In each case
a single type of RNA transcript is
spliced in two alternative ways to
produce two distinct mRNAs (1 and
2). The dark blue boxes mark exon
sequences that are retained in both
mRNAs. The light blue boxes mark
possible exon sequences that are
included in only one of the mRNAs;
these boxes are joined by red lines to
indicate where intron sequences
(yellow) are removed. (Adapted with
permission from A. Andreadis, M.E.
Gallego, and B. Nadal-Ginard, Annu.
Rev. Cell Biol. 3:207-242, 1987.
Annual Reviews, Inc.)






[Figure 9-74 diagram. (A) NEGATIVE CONTROL: in tissue 1 the primary transcript
is spliced to give an mRNA; in tissue 2 a repressor is bound and splicing is
blocked. (B) POSITIVE CONTROL: in tissue 1 there is no splicing of the primary
transcript; in tissue 2 an activator is bound and splicing occurs.]



Figure 9-74 Negative and positive 
control of alternative RNA splicing. 

(A) Negative control, in which a 
repressor protein binds to the primary 
RNA transcript in tissue 2, thereby 
preventing the splicing machinery 
from removing an intron sequence. 

(B) Positive control, in which the 
splicing machinery is unable to 
remove a particular intron sequence 
without assistance from an activator 
protein. 



In many cases, however, alternative RNA splicing is regulated rather than
constitutive. In the simplest examples regulated splicing is used to switch from
the production of a nonfunctional protein to the production of a functional one.
The transposase that catalyzes the transposition of the Drosophila P element, for
example, is produced in a functional form in germ cells and a nonfunctional form
in somatic cells of the fly, allowing the P element to spread throughout the
genome of the fly without causing damage in somatic cells. The difference in
transposon activity has been traced to the presence of an intron sequence in the
transposase RNA that is removed only in germ cells.

RNA splicing can be regulated either negatively, by a regulatory molecule that
prevents the splicing machinery from gaining access to a particular splice site
on the RNA, or positively, by a regulatory molecule that directs the splicing
machinery to an otherwise overlooked splice site (Figure 9-74). In the case of
the Drosophila transposase, the key splicing event is blocked in somatic cells
by negative regulation.
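
The two modes of control differ only in what happens when the regulatory molecule is absent, which a pair of Boolean rules makes explicit. The sketch below is schematic: the function names and the string labels are invented for the example, and the P element case is reduced to the single statement made in the text, that a repressor blocks removal of the key intron in somatic cells.

```python
def intron_is_removed(control, regulator_bound):
    """Decide whether a regulated intron is spliced out.

    Negative control: splicing proceeds unless a repressor is bound.
    Positive control: splicing proceeds only if an activator is bound.
    """
    if control == "negative":
        return not regulator_bound
    if control == "positive":
        return regulator_bound
    raise ValueError("control must be 'negative' or 'positive'")


def p_element_transposase(cell_type):
    """Functional state of the P element transposase in a given cell type."""
    if cell_type not in ("germ", "somatic"):
        raise ValueError("cell_type must be 'germ' or 'somatic'")
    repressor_bound = cell_type == "somatic"
    spliced = intron_is_removed("negative", repressor_bound)
    return "functional" if spliced else "nonfunctional"


print(p_element_transposase("germ"))     # functional
print(p_element_transposase("somatic"))  # nonfunctional
```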

In addition to switching from the production of a functional protein to the
production of a nonfunctional one, the regulation of RNA splicing can generate
different versions of a protein in different cell types, according to the needs
of the cell. The tyrosine protein kinase encoded by the src proto-oncogene, for
example, is produced in a specialized form in nerve cells by this mechanism
(Figure 9-75). Cell-type-specific forms of many other proteins are produced in
the same way.



[Figure 9-75 diagram: arrangement of protein-coding exons in the src gene on the
DNA (scale bar, 1000 nucleotide pairs). Most cells: exons 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12 give a Src protein of 533 amino acids. Nerve cells: exons 2, 3, A,
4, 5, 6, 7, 8, 9, 10, 11, 12 give a Src protein of 539 amino acids.]



Figure 9-75 Regulated alternative 
RNA splicing produces cell-type- 
specific forms of a gene product. 
Here two slightly different tyrosine 
protein kinases are produced from 
the src gene because exon sequence A 
is included only in nerve cells. The 
neural form of the Src protein 
contains an extra site for phospho- 
rylation and is also thought to have a 
higher specific activity. Only the 
protein-coding exons (colored) are
shown in this diagram (exon 1, which
forms the 5' leader on the mRNA, is
not shown). (After J.B. Levy et al., Mol.
Cell. Biol. 7:4142-4145, 1987.)






Sex Determination in Drosophila Depends 
on a Regulated Series of RNA Splicing Events 55 

In Drosophila the primary signal for determining whether the fly develops as a
male or female is the X chromosome/autosome ratio. Individuals with an X
chromosome/autosome ratio of 1 (normally two X chromosomes and two sets of
autosomes) develop as females, while those with a ratio of 0.5 (normally one X
chromosome and two sets of autosomes) develop as males. This ratio is somehow
assessed early in development and is remembered by each cell thereafter.
Three crucial gene products are involved in transmitting information about this
ratio to the many other genes that specify male and female characteristics
(Figure 9-76). As explained in Figure 9-77, sex determination in Drosophila
depends on a cascade of regulated RNA splicing events that involves these three
gene products.

Drosophila sex determination provides the best-understood example of
a regulatory cascade based on RNA splicing. It is not clear why the fly should
use this strategy. Other organisms (the nematode, for example) use an entirely
different scheme for sex determination, one based on transcriptional and
translational controls. Moreover, the Drosophila male-determination pathway
requires that a number of nonfunctional RNA molecules be continually produced,
which seems unnecessarily wasteful. One speculation is that this RNA-splicing
cascade is an ancient control device, left over from a stage of evolution where
RNA was the predominant biological molecule and controls of gene expression had
to be based almost entirely on RNA-RNA interactions.

A Change in the Site of RNA Transcript Cleavage
and Poly-A Addition Can Change the Carboxyl Terminus
of a Protein 56

In eucaryotes the 3' end of an mRNA molecule is not determined by the termi- 
nation of RNA synthesis by the RNA polymerase as it is in bacteria. Instead, it is 
determined by an RNA cleavage reaction that is catalyzed by additional factors 
while the transcript is elongating (see Figure 8-49). A cell can control the site of 
this cleavage so as to change the carboxyl terminus of the resultant protein 
(which is encoded by the 3' end of the mRNA). 

A well-studied example is the switch from the synthesis of membrane-bound 
to secreted antibody molecules that occurs during the development of B lympho- 
cytes. Early in the life history of a B lymphocyte, the antibody it produces is an- 
chored in the plasma membrane, where it serves as a receptor for antigen. An- 
tigen stimulation causes these cells to multiply and to start secreting their 
antibody. The secreted form of the antibody is identical to the membrane-bound 
form except at the extreme carboxyl terminus. In this part of the protein the 
membrane-bound form has a long string of hydrophobic amino acids that 
traverses the lipid bilayer of the membrane, whereas the secreted form has a 
much shorter string of hydrophilic amino acids. The switch from membrane- 
bound to secreted antibody therefore requires a different nucleotide sequence 
at the 3' end of the mRNA; this difference is generated through a change in the 
length of the primary RNA transcript caused by a change in the site of RNA cleav- 
age, as described in Figure 9-78. 
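
The chain of consequences that runs from the choice of cleavage site to the fate of the antibody can be written out as a short function. This is a schematic sketch only: the function name and dictionary keys are invented, and the intermediate step (a short transcript lacks the 3' acceptor splice site, so the last intron sequence is not removed) follows the scheme of Figure 9-78 rather than anything stated in the paragraph above.

```python
def antibody_made(antigen_stimulated):
    """Trace how the site of RNA cleavage sets the form of the antibody."""
    # Unstimulated B lymphocytes make the long transcript; after antigen
    # stimulation the RNA is cleaved at the upstream site instead.
    transcript = "short" if antigen_stimulated else "long"

    # Only the long transcript includes the 3' (acceptor) splice site,
    # so only there can the last intron sequence be removed.
    last_intron_removed = transcript == "long"

    if last_intron_removed:
        carboxyl_terminus = "hydrophobic"
        antibody = "membrane-bound"
    else:
        carboxyl_terminus = "hydrophilic"
        antibody = "secreted"

    return {
        "transcript": transcript,
        "last_intron_removed": last_intron_removed,
        "carboxyl_terminus": carboxyl_terminus,
        "antibody": antibody,
    }


print(antibody_made(antigen_stimulated=False)["antibody"])  # membrane-bound
print(antibody_made(antigen_stimulated=True)["antibody"])   # secreted
```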



[Figure 9-76 diagram: the X chromosome/autosome ratio acts through a sequence
of gene products.]

Figure 9-76 Sex determination in
Drosophila. The gene products
shown act in a sequential cascade to
determine the sex of the fly according
to the X chromosome/autosome ratio.
The genes are called sex-lethal (Sxl),
transformer (tra), and doublesex (dsx)
because of the phenotypes that result
when the gene is inactivated by
mutation. The function of these gene
products is to transmit the
information about the X chromosome/
autosome ratio to the many
other genes that are involved in
creating the sex-related phenotypes.
These other genes function as two
alternative sets: those that specify
female features and those that specify
male features (see Figure 9-77).



The Definition of a Gene Has Had to Be Modified Since 
the Discovery of Alternative RNA Splicing 57 

The discovery that eucaryotic genes usually contain introns and that their
coding sequences can be put together in more than one way raised new questions
about the definition of a gene. A gene was first clearly defined in molecular
terms in the early 1940s from work on the biochemical genetics of the fungus
Neurospora. Until then, a gene had been defined operationally as a region of
the genome that segregates as a single unit during meiosis and gives rise to a
definable phenotypic trait, such as a red or a white eye in Drosophila or a
round or wrinkled seed in peas. The work on Neurospora showed that most genes
correspond to a region of the genome that directs the synthesis of a single
enzyme. This led to the hypothesis that one gene encodes one polypeptide chain.
The hypothesis proved fruitful for subsequent research; and, as more was
learned about the mechanism of gene expression in the 1960s, a gene became
identified as that stretch of DNA that was transcribed into the RNA coding for
a single polypeptide chain (or a single structural RNA such as a tRNA or an
rRNA molecule). The discovery of split genes in the late 1970s could be readily
accommodated by the original definition of a gene, provided that a single
polypeptide chain was specified by the RNA transcribed from any one DNA
sequence. But it is now clear that many DNA sequences in higher eucaryotic
cells produce two or more distinct proteins by means of alternative RNA
splicing. How then is a gene to be defined?

In those relatively rare cases in which two very different eucaryotic proteins
are produced from a single transcription unit, the two proteins are considered
to be produced by distinct genes that overlap on the chromosome. It seems
unnecessarily complex, however, to consider most of the protein variants
produced by alternative RNA splicing as being derived from overlapping genes. A
more sensible alternative is to modify the original definition to include as a
gene any DNA sequence that is transcribed as a single unit and encodes one set
of closely related polypeptide chains (protein isoforms). This definition of a
gene also accommodates those DNA sequences that encode protein variants
produced by posttranscriptional processes other than RNA splicing, such as
ribosomal frameshifting and RNA editing, which we discuss later.

[Figure 9-77 diagram: splicing of the Sxl, tra, and dsx primary RNA transcripts
in the male (X:A = 0.5) and in the female (X:A = 1). Male: the regulated 3'
splice sites are used and nonfunctional proteins are produced from Sxl and tra;
the Dsx protein (400 aa plus 150 aa that are male-specific) REPRESSES FEMALE
DIFFERENTIATION GENES, giving MALE DEVELOPMENT. Female: splice sites are
blocked, giving functional Sxl protein and functional Tra protein; with the
Tra-2 protein, the Dsx protein (400 aa plus 30 aa that are female-specific)
REPRESSES MALE DIFFERENTIATION GENES, giving FEMALE DEVELOPMENT.]



Figure 9-77 The cascade of changes
in gene expression that determines
the sex of a fly depends on
alternative RNA splicing. An X
chromosome/autosome ratio of 0.5
results in male development. Male is
the "default" pathway in which the
Sxl and tra genes are both
transcribed, but the RNAs are spliced
constitutively to produce only
nonfunctional RNA molecules, and
the dsx transcript is spliced to
produce a protein that turns off the
genes that specify female
characteristics. An X chromosome/
autosome ratio of 1 triggers the
female differentiation pathway in the
embryo by transiently activating a
promoter within the Sxl gene that
causes a functional Sxl protein to be
synthesized. Sxl is a splicing
regulatory protein with two sites of
action: (1) it binds to a constitutively
produced Sxl RNA transcript, causing
a female-specific splice that continues
the production of a functional Sxl
protein, and (2) it binds to the
constitutively produced tra RNA and
causes an alternative splice of this
transcript, which now produces an
active Tra regulatory protein. The Tra
protein acts with the constitutively
produced Tra-2 protein to produce
the female-specific spliced form of
the dsx transcript; this encodes the
female form of the Dsx protein, which
turns off the genes that specify male
features. The components in this
pathway were all initially identified
through the study of Drosophila
mutants that are altered in their
sexual development. The dsx gene, for
example, derives its name (doublesex)
from the observation that a fly lacking
this gene product expresses both
male- and female-specific features.
Note that, whereas both the Sxl and
Tra proteins bind to specific RNA
sites, Sxl is a repressor that acts
negatively to block a splice, whereas
the Tra proteins are activators that act
positively to induce a splice (see
Figure 9-74).
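
The cascade described in this legend can be followed step by step in a small model. It is a deliberately coarse sketch: only the two X chromosome/autosome ratios discussed in the text are accepted, each gene product is reduced to "functional or not," and the optional arguments `tra_gene_ok` and `dsx_gene_ok` are hypothetical switches added to mimic loss-of-function mutants. The function and key names are invented for the example.

```python
def sex_determination(x_chromosomes, autosome_sets,
                      tra_gene_ok=True, dsx_gene_ok=True):
    """Follow the splicing cascade from the X:A ratio to the outcome."""
    ratio = x_chromosomes / autosome_sets
    if ratio not in (0.5, 1.0):
        raise ValueError("only X:A ratios of 0.5 and 1 are modeled")

    # A ratio of 1 leads to functional Sxl protein, which then keeps
    # its own transcript spliced in the female-specific way.
    sxl_protein = ratio == 1.0

    # Sxl redirects splicing of the tra transcript to give active Tra.
    tra_protein = tra_gene_ok and sxl_protein

    # Tra-2 is produced constitutively in this model.
    tra2_protein = True

    if not dsx_gene_ok:
        dsx_form = None
        outcome = "both male- and female-specific features"
    elif tra_protein and tra2_protein:
        dsx_form = "female"      # turns off male-specifying genes
        outcome = "female development"
    else:
        dsx_form = "male"        # default splice; turns off female genes
        outcome = "male development"

    return {"ratio": ratio, "Sxl": sxl_protein, "Tra": tra_protein,
            "Dsx": dsx_form, "outcome": outcome}


print(sex_determination(2, 2)["outcome"])                     # female development
print(sex_determination(1, 2)["outcome"])                     # male development
print(sex_determination(2, 2, tra_gene_ok=False)["outcome"])  # male development
```

Because every step downstream of the ratio is a splicing decision, removing one functional protein from the chain (as in the third call) sends the dsx transcript down the default, male pathway.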






[Figure 9-78 diagram: the DNA is marked with the site where the RNA is cleaved
for the short transcript, the site where the RNA is cleaved for the long
transcript, a 5' splice site (donor), a 3' splice site (acceptor), and stop
codons I and II. After TRANSCRIPTION, the LONG RNA TRANSCRIPT contains both the
donor and the acceptor; the intron sequence is removed by RNA processing, and
TRANSLATION of the mRNA gives MEMBRANE-BOUND ANTIBODY with a terminal
hydrophobic peptide. In the SHORT RNA TRANSCRIPT the intron sequences are not
removed because the acceptor splice junction is missing, and TRANSLATION of the
mRNA gives SECRETED ANTIBODY.]



RNA Transport from the Nucleus Can Be Regulated 58 

An average primary RNA transcript seems to be at most 10 times longer than the 
mature mRNA molecule generated from it by RNA splicing. Yet it has been es- 
timated that only about one-twentieth of the total mass of RNA made ever leaves 
the nucleus. It seems, therefore, that a substantial fraction of the primary tran- 
scripts (perhaps half) may be completely degraded in the nucleus without ever 
generating an RNA molecule that reaches the cytoplasm. The discarded RNAs 
may consist of sequences that are never made into an mRNA molecule; on the 
other hand, some may represent potential mRNA molecules that are functional 
in some cell types but in others fail to get delivered to the cytoplasm. This might 
be either because they are selectively targeted for intranuclear degradation or 
because their exit from the nucleus is selectively blocked. 
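
The "perhaps half" estimate in the preceding paragraph follows from simple arithmetic, sketched below. This is only an illustration: it assumes that all primary transcripts are of similar size and that a processed transcript keeps about one-tenth of its mass as mRNA. The function name is hypothetical.

```python
def fraction_of_transcripts_exported(mrna_fraction_of_transcript: float,
                                     exported_fraction_of_mass: float) -> float:
    """Fraction of primary transcripts that must give rise to an exported mRNA.

    If every transcript were processed and exported, the exported mass would
    be mrna_fraction_of_transcript of the total RNA made. The observed value
    is smaller, so only a fraction of the transcripts can be contributing.
    """
    return exported_fraction_of_mass / mrna_fraction_of_transcript


# Values from the text: a transcript is at most about 10 times longer than
# its mRNA, and about one-twentieth of the RNA mass leaves the nucleus.
exported = fraction_of_transcripts_exported(1 / 10, 1 / 20)
degraded = 1 - exported
print(f"transcripts that yield an exported mRNA: {exported:.2f}")
print(f"transcripts degraded entirely in the nucleus: {degraded:.2f}")
```

Because 10 is an upper bound on the length ratio, the estimate is rough; a smaller ratio would imply that an even larger fraction of transcripts never leaves the nucleus.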

Although there is very little solid evidence for either form of control, each of 
them remains a real possibility. In particular, RNA export through the nuclear 
pores is an active process, and for most RNAs it requires a specific nucleotide cap 
at the 5' end of the RNA molecule and a poly-A tail at the 3' end (discussed in 
Chapter 8). A requirement of this type makes sense, since it keeps junk RNA frag- 
ments (such as the intron sequences removed by RNA splicing) out of the cytosol. 
But having the proper types of ends is not enough for transport: each mRNA 
precursor molecule remains tethered to sites inside the nucleus until all of the 
spliceosome components have dissociated from it (see Figure 8-54). Therefore, 
any mechanism that prevents the completion of RNA splicing on particular RNA 
molecules could, in principle, block the exit of those RNAs from the nucleus. 






Figure 9-78 Regulation of the site of 
RNA cleavage and poly-A addition 
determines whether an antibody 
molecule is secreted or remains 
membrane-bound. In unstimulated B 
lymphocytes (left) a long RNA 
transcript is produced, and the intron 
sequence near its 3' end is removed 
by RNA splicing to give rise to an 
mRNA molecule that codes for a 
membrane-bound antibody molecule. 
In contrast, after antigen stimulation 
(right) the primary RNA transcript is 
cleaved upstream from the splice site 
in front of the last exon sequence. As 
a result, some of the intron sequence 
that is removed from the long 
transcript remains as coding 
sequence in the short transcript. 
These are the sequences that encode 
the hydrophilic carboxyl-terminal 
portion of the secreted antibody 
molecule. 






Some mRNAs Are Localized to Specific Regions 
of the Cytoplasm 59 

After a newly made eucaryotic mRNA molecule has passed through a nuclear 
pore and entered the cytosol, it encounters ribosomes that translate it into a 
polypeptide chain. If the mRNA encodes a protein that is destined to be secreted 
or expressed on the cell surface, it will be directed to the endoplasmic reticulum 
(ER) by a signal sequence at the protein's amino terminus; the signal sequence 
will be recognized as soon as it emerges from the ribosome by components of the 
cell's protein-sorting apparatus. This apparatus then directs the entire complex 
of ribosome, mRNA, and nascent protein to the membrane of the ER, where the 
remainder of the polypeptide chain is synthesized, as discussed in Chapter 12. 
In other cases the entire protein is synthesized by free ribosomes in the cytosol, 
and signals in the completed polypeptide chain may then direct the protein to 
other sites in the cell. 

In still other cases, however, mRNAs are directed to specific intracellular 
locations by signals in the mRNA sequence itself, before the sequence has been 
translated into an amino acid sequence. The signal is typically located in the 3' 
untranslated region (UTR) of the mRNA molecule, a region that extends from the 
stop codon, which terminates protein synthesis, to the start of the poly-A tail (see 
Figure 9-78). A striking example is seen in the Drosophila egg, where the mRNA 
encoding the bicoid gene regulatory protein is attached to the cortical 
cytoskeleton at the anterior tip of the developing egg. When the translation of this 
mRNA is triggered by fertilization, a gradient of the bicoid protein is generated 
that plays a crucial part in directing the development of the anterior part of the 
embryo (shown in Figure 9-39 and discussed in more detail in Chapter 21). Some 
mRNAs in somatic cells are localized in a similar way. The mRNA that encodes 
actin, for example, is localized to the actin-filament-rich cell cortex in 
mammalian fibroblasts by means of a 3' UTR signal, presumably because it is 
advantageous for the cell to position its mRNAs close to the sites where the protein 
produced from the mRNA is required. This form of posttranscriptional gene 
regulation, where mRNA is specifically localized to one part of the cell, has been 
recognized only recently, and it is still unclear how many mRNAs are localized 
in this way. The role of the UTR in localizing mRNAs to a particular region of the 
cytoplasm is illustrated in Figure 9-79. 
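
The definition of the 3' UTR given above (from the stop codon to the start of the poly-A tail) can be made concrete with a short sketch. It is a toy illustration, not a tool for real sequence data: the function name, the example sequence, and the `min_tail` threshold for recognizing a poly-A tail are all assumptions introduced here.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def three_prime_utr(mrna: str, start: int, min_tail: int = 6) -> str:
    """Return the region between the stop codon and the poly-A tail.

    mrna     -- RNA sequence (A, C, G, U), written 5' to 3'
    start    -- index of the A of the initiating AUG codon
    min_tail -- assumed shortest run of trailing A's treated as a poly-A tail
    """
    # Read the frame codon by codon until the first in-frame stop codon.
    stop_end = None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            stop_end = i + 3
            break
    if stop_end is None:
        raise ValueError("no in-frame stop codon found")

    # Treat the run of A's at the 3' end as the tail if it is long enough.
    # (A UTR that itself ends in A would lose those A's to the tail.)
    body = mrna.rstrip("A")
    tail_len = len(mrna) - len(body)
    tail_start = len(body) if tail_len >= min_tail else len(mrna)
    return mrna[stop_end:max(stop_end, tail_start)]


# Invented example: 5' leader, short coding region, stop codon, UTR, tail.
example = "GG" + "AUGGCUUGG" + "UAA" + "CUUCGUAC" + "A" * 10
print(three_prime_utr(example, start=2))
```

In experiments such as the one in Figure 9-79, it is this region that is swapped between constructs while the coding sequence is held constant.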



[Figure 9-79 diagram: (A) the recombinant DNA sequence inserted into the Drosophila genome, consisting of a Drosophila regulatory region, the bacterial gene for β-galactosidase (lacZ gene), and a test 3' UTR, together with the RNA transcript and the antisense probe used for in situ hybridization; (B) three possible fates for mRNA localization in epithelial cells of the early Drosophila embryo (unlocalized, basal, or apical), each determined by a different 3' UTR; (C) photograph.]



Figure 9-79 The importance of the 
UTR in localizing mRNAs to specific 
regions of the cytoplasm. Drosophila 
can be transfected with a 
recombinant DNA molecule coding 
for an mRNA in which a bacterial 
reporter sequence (encoding β-galactosidase) 
is linked to a chosen 3' 
UTR sequence. According to the 
choice of 3' UTR, the mRNA may be 
unlocalized in the embryonic cells, 
localized at their basal ends, or 
localized at their apical ends. (A) The 
recombinant DNA molecule used to 
test the effects of different 3' UTRs. 
(B) In situ hybridization (for β-galactosidase 
RNA sequences) shows 
that the 3' UTR determines the 
localization of the mRNA in the 
embryonic cells. (C) Photograph of an 
apically localized mRNA detected by 
in situ hybridization. The cells 
containing this mRNA are arranged in 
stripes along the axis of the embryo. 
(C, courtesy of David Ish-Horowitz.) 



RNA Editing Can Change the Meaning of the RNA Message 60 

The molecular mechanisms used by cells are a continual source of surprises. An 
example is the phenomenon of trans RNA splicing, which occurs in all transcripts 
in the trypanosomes (the parasitic protozoa responsible for sleeping sickness). 
All mRNAs in trypanosomes possess a common 5' capped leader sequence that 
is transcribed separately and added to the 5' ends of RNA transcripts by splicing 
of two initially unconnected RNA molecules. Trans splicing is also used to add 
a 5' leader to several mRNAs in nematodes and to combine the separate RNA 
transcripts that form the coding sequence of some chloroplast and mitochondrial 
proteins in plant cells. Cutting and pasting between RNA transcripts could speed 
up the evolution of new proteins, and the few cases where exons are known to 
be joined in this way are suspected to be evolutionary remnants of a much more 
extensive process that dominated ancient cells. 

Another startling discovery is the process of RNA editing, whereby the nucle- 
otide sequences of RNA transcripts are altered. In this process, discovered in RNA 
transcripts that code for proteins in the mitochondria of trypanosomes, one or 
more U nucleotides are inserted into (or, less frequently, removed from) selected 
regions of a transcript, causing major modifications in both the original reading 
frame and the sequence, thereby changing the meaning of the message. For some 
genes the editing is so extensive that over half of the nucleotides in the mature 
mRNA are U nucleotides that were inserted during the editing process. The in- 
formation that specifies exactly how the initial RNA transcript is to be altered is 
contained in a set of 40- to 80-nucleotide-long RNA molecules that are separately 
transcribed. These so-called guide RNAs have a 5' end that is complementary in 
sequence to one end of the region of the transcript to be edited; this is followed 
by a sequence that specifies the set of nucleotides to be inserted into the tran- 
script and then a continuous run of U nucleotides. The editing mechanism is 
surprisingly complex, with the U nucleotides at the 3' end of the guide RNA being 
transferred directly into the transcript, as illustrated in Figure 9-80. 
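
The effect of U insertion on a transcript can be sketched in a few lines. This is a deliberately simplified model and not the enzymatic mechanism: the guide RNA is represented only by the edited sequence it specifies (`template`, a hypothetical parameter), only insertions are handled, and the sequences are invented.

```python
def insert_us(pre_mrna: str, template: str) -> tuple[str, int]:
    """Insert U nucleotides into pre_mrna so that it matches template.

    template stands for the edited sequence specified by the guide RNA.
    Returns the edited RNA and the number of U nucleotides inserted.
    """
    edited = []
    inserted = 0
    i = 0  # position in the unedited transcript
    for base in template:
        if i < len(pre_mrna) and pre_mrna[i] == base:
            edited.append(base)        # transcript already has this base
            i += 1
        elif base == "U":
            edited.append("U")         # site missing a U: insert one
            inserted += 1
        else:
            raise ValueError("template differs by more than U insertions")
    if i != len(pre_mrna):
        raise ValueError("template does not cover the whole transcript")
    return "".join(edited), inserted


unedited = "AGCGA"
edited, n_inserted = insert_us(unedited, "AUUGCUGUA")
print(edited, n_inserted, f"{n_inserted / len(edited):.0%} of bases inserted")
```

Even in this toy case the inserted U nucleotides shift the reading frame of everything downstream, which is why editing changes the meaning of the message so thoroughly.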



[Figure 9-80 diagram: an RNA transcript with sites missing U nucleotides pairs at its 3' end with guide RNA 1, which carries a poly-U tail and contains nucleotides that specify the missing U nucleotides; editing is followed by pairing to guide RNA 2, and repeated rounds give the fully edited mRNA.]



Figure 9-80 RNA editing in the 
mitochondria of trypanosomes. 
Guide RNAs contain at their 3' end a 
stretch of poly U, which donates U 
nucleotides to sites on the RNA 
transcript that mispair with the guide 
RNA; thus the poly-U tail gets shorter 
as editing proceeds (not shown). 
Editing generally starts near the 3' 
end and progresses toward the 5' end 
of the RNA transcript, as shown, 
because the "anchor sequence" at the 
5' end of most guide RNAs can pair 
only with edited sequences. 






Extensive editing of mRNA sequences has also been found in the mitochondria 
of many plants, with nearly every mRNA being edited to some extent. In this 
case, however, bases are changed from C to U in the RNA, without nucleotide 
insertions or deletions. Often many of the Cs in an mRNA are affected by editing, 
changing 10% or more of the amino acids that the mRNA encodes. 

We can only speculate as to why the mitochondria of trypanosomes and 
plants make use of such extensive RNA editing. The suggestions that seem most 
reasonable are based on the premise that mitochondria contain a primitive 
genetic system. There is evidence that editing is regulated to produce different 
mRNAs under different conditions, so that RNA editing can be viewed as a 
primitive way to change the expression of genes. Trypanosomes are extremely ancient 
single-celled eucaryotes, which diverged very early on from the lineage leading 
to plants, animals, and yeasts (see Figure 1-16). Perhaps, therefore, the extreme 
version of RNA editing found in their mitochondria is a holdover from very 
ancient cells, where most catalyses were carried out by RNA molecules rather than 
by proteins. 

RNA editing of a much more limited kind occurs in mammals. The first case 
discovered involved the apolipoprotein-B gene, where RNA editing produces two 
types of transcripts: in one of these a DNA-encoded C is changed to a U, creating 
a stop codon that causes a truncated version of this large protein to be made 
in a tissue-specific manner. In another case a nucleotide change in the middle 
of an mRNA molecule changes a single amino acid in a transmitter-gated ion 
channel in the brain, significantly altering the channel's permeability to Ca2+. For 
apolipoprotein-B the editing is catalyzed in a very straightforward way: a protein 
binds to a specific sequence in the mRNA and then catalyzes the deamination 
of a C to a U. It is not known whether the other cases of mammalian and plant 
RNA editing are protein-mediated in this way or whether, instead, they make use 
of short RNA templates, as in trypanosomes. 
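
How a single C-to-U change can truncate a protein is easy to show with a toy translation. The sketch below assumes the codon change usually described for apolipoprotein-B, CAA (glutamine) to UAA (stop), which is not stated in the text above; the surrounding sequence, the function names, and the tiny codon table are invented for illustration.

```python
# Only the codons needed for this toy example; not a full genetic code table.
CODONS = {"AUG": "Met", "GAU": "Asp", "CAA": "Gln", "AUA": "Ile", "UAA": "STOP"}


def translate(mrna: str) -> list[str]:
    """Translate from the first base until a stop codon or the end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide


def edit_c_to_u(mrna: str, position: int) -> str:
    """Deaminate the C at the given index to a U."""
    if mrna[position] != "C":
        raise ValueError("editing site is not a C")
    return mrna[:position] + "U" + mrna[position + 1:]


unedited = "AUGGAUCAAAUAUAA"        # Met Asp Gln Ile STOP
edited = edit_c_to_u(unedited, 6)   # CAA becomes UAA

print("unedited:", translate(unedited))
print("edited:  ", translate(edited))
```

The edited message encodes only the amino acids that precede the new stop codon, giving the shorter form of the protein.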

We now turn to controls that operate on the translation of mRNAs into 
proteins. 

Procaryotic and Eucaryotic Cells Use Different Strategies 
to Specify the Translation Start Site on an mRNA Molecule 61 

In bacterial mRNAs a conserved stretch of six nucleotides, the Shine-Dalgarno 
sequence, is always found a few nucleotides upstream of the initiating AUG 
codon. This sequence forms base pairs with the 16S RNA in the small ribosomal 
subunit and thereby correctly positions the initiating AUG codon in the ribosome. 
This interaction makes a major contribution to the efficiency of initiation and 
provides the bacterial cell with a simple way to regulate protein synthesis. Many 
translational control mechanisms in procaryotes involve blocking the 
Shine-Dalgarno sequence, either by covering it with a bound protein or by 
incorporating it into a base-paired region in the mRNA molecule. 
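
The positional rule for bacterial start sites can be sketched as a simple search. Several things here are assumptions not given in the text: the consensus AGGAGG for the Shine-Dalgarno sequence, the allowed spacing of 4 to 10 nucleotides, exact matching (real sites often match only partly), and the invented example sequence.

```python
import re

SHINE_DALGARNO = "AGGAGG"  # assumed consensus sequence


def find_start_codons(mrna: str, min_gap: int = 4, max_gap: int = 10) -> list[int]:
    """Return indices of AUG codons that have a Shine-Dalgarno sequence
    between min_gap and max_gap nucleotides upstream."""
    starts = []
    for match in re.finditer("(?=AUG)", mrna):
        aug = match.start()
        for gap in range(min_gap, max_gap + 1):
            sd_start = aug - gap - len(SHINE_DALGARNO)
            if sd_start >= 0 and mrna.startswith(SHINE_DALGARNO, sd_start):
                starts.append(aug)
                break
    return starts


# Invented message: the first AUG has no upstream Shine-Dalgarno sequence,
# the second lies 6 nucleotides downstream of one.
message = "CCAUGCC" + "AGGAGG" + "UUAACC" + "AUGGCUUAA"
print(find_start_codons(message))
```

In this picture, a repressor protein or a base-paired structure that hides the Shine-Dalgarno sequence removes the corresponding AUG from the list of usable start sites.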

Eucaryotic mRNAs do not contain a Shine-Dalgarno sequence. Instead, the 
selection of an AUG codon as a translation start site is largely determined by its 
proximity to the cap at the 5' end of the mRNA molecule, which is the site at 
which the small ribosomal subunit binds to the mRNA and begins scanning for 
an initiating AUG codon (discussed in Chapter 6). The nucleotides immediately 
surrounding the start site in eucaryotic mRNAs also influence the efficiency of 
AUG recognition during the scanning process. If this recognition site is poor 
enough, scanning ribosomal subunits will ignore the first AUG codon in the 
mRNA and skip to the second or third AUG codon instead. This phenomenon, 
known as "leaky scanning," is a strategy frequently used to produce two or more 
proteins, differing in their amino termini, from the same mRNA. It allows some 
genes to produce the same protein with and without a signal sequence attached 
at its amino terminus, for example, so that the protein is directed to two 
different compartments in the cell. 






Another important difference between eucaryotic and procaryotic translation 
is that eucaryotic ribosomes dissociate rapidly from the mRNA when 
translation terminates. Thus reinitiation at an internal AUG codon after translation of 
a preceding open reading frame is much less efficient in eucaryotes than in 
procaryotes. Together, these differences (scanning from the 5' cap and a limited 
ability to reinitiate at internal AUG codons) explain why the vast majority of 
eucaryotic mRNAs encode only a single protein and why the first AUG codon 
from the 5' end is usually the functional start site for translation. 

A few eucaryotic cell and viral mRNAs initiate translation by an alternative 
mechanism that involves internal initiation rather than scanning. These mRNAs 
contain complex nucleotide sequences, called internal ribosome entry sites, where 
ribosomes bind in a cap-independent fashion and start translation at the next 
AUG codon downstream. The details of this mechanism are not known. 

The Phosphorylation of an Initiation Factor 
Regulates Protein Synthesis 62 

Eucaryotic cells decrease their overall rate of protein synthesis in response to a 
variety of situations, including the deprivation of growth factors, infection by 
viruses, heat shock, and entry into M phase of the cell cycle. Much of this 
regulation is thought to involve the initiation factor eIF-2, which is phosphorylated 
by specific protein kinases to decrease the overall rate of protein synthesis. 

The normal function of eIF-2 is outlined in Figure 9-81. This protein forms 
a complex with GTP and mediates the binding of the methionyl initiator tRNA 
to the small ribosomal subunit, which then binds to the 5' cap of the mRNA and 
begins scanning along the mRNA. After an AUG codon is recognized, the bound 
GTP is hydrolyzed to GDP by the eIF-2 protein, causing a conformational change 
in the protein and releasing it from the small ribosomal subunit. The large 
ribosomal subunit then joins the small one to form a complete ribosome that begins 
protein synthesis. 

Because eIF-2 binds very tightly to GDP, a guanine nucleotide releasing 
protein (see Figure 15-50), designated eIF-2B, is required to cause GDP release so 
that a new GTP molecule can bind and eIF-2 can be reused (Figure 9-82A). The 
reuse of eIF-2 is inhibited when it is phosphorylated because phosphorylated 
eIF-2 binds to eIF-2B unusually tightly, preventing the completion of nucleotide 
exchange. There is more eIF-2 than eIF-2B in cells, and even a fraction of 
phosphorylated eIF-2 can trap nearly all of the available eIF-2B, thereby preventing 
the reuse of even the nonphosphorylated eIF-2 and greatly slowing protein 
synthesis (Figure 9-82B). 
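
The sequestration argument is a matter of stoichiometry, which a small calculation makes explicit. The sketch assumes, as a simplification, that each phosphorylated eIF-2 molecule binds one eIF-2B molecule and does not release it; the molecule counts are invented, since the text gives no numbers.

```python
def active_eif2b(total_eif2b: int, phosphorylated_eif2: int) -> int:
    """Number of eIF-2B molecules still free to recycle eIF-2.

    Assumes one-to-one, effectively irreversible binding of
    phosphorylated eIF-2 to eIF-2B.
    """
    trapped = min(total_eif2b, phosphorylated_eif2)
    return total_eif2b - trapped


TOTAL_EIF2 = 1000   # hypothetical: eIF-2 is in excess over eIF-2B
TOTAL_EIF2B = 200   # hypothetical

for phosphorylated in (0, 100, 200, 300):
    free = active_eif2b(TOTAL_EIF2B, phosphorylated)
    print(f"{phosphorylated / TOTAL_EIF2:.0%} of eIF-2 phosphorylated "
          f"-> {free} eIF-2B molecules still active")
```

With these numbers, phosphorylating only one-fifth of the eIF-2 leaves no active eIF-2B, so the remaining four-fifths of the eIF-2 stays in its GDP-bound form and cannot be reused.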

When the activity of a general translation factor, such as eIF-2, is reduced by 
phosphorylation, one might expect that the translation of all mRNAs would be 
reduced equally. Contrary to this expectation, however, the phosphorylation of 



Figure 9-81 The role of eIF-2 in the initiation of protein synthesis. 

[Figure 9-81 diagram labels: methionyl initiator tRNA, eIF-2, mRNA, 40S ribosomal subunit, 60S ribosomal subunit, protein synthesis.]



Figure 9-82 The eIF-2 cycle. (A) The recycling of used eIF-2 by a guanine 
nucleotide releasing protein (eIF-2B). (B) eIF-2 phosphorylation controls 
protein synthesis rates by tying up eIF-2B. 

[Figure 9-82 diagram: (A) the guanine nucleotide releasing protein, eIF-2B, converts inactive eIF-2 to the active form. (B) A protein kinase phosphorylates eIF-2; phosphorylated eIF-2 sequesters all eIF-2B as an inactive complex; in the absence of active eIF-2B, excess eIF-2 remains in its inactive, GDP-bound form and protein synthesis slows dramatically.]



eIF-2 can have selective effects, even enhancing the translation of specific 
mRNAs. This can enable yeast cells, for example, to adapt to starvation for 
specific nutrients by shutting down the synthesis of all proteins except those that are 
required for synthesis of the missing nutrients. The details have been worked out 
for a specific yeast mRNA that encodes a protein called GCN4, a gene regulatory 
protein that is required for the activation of many genes encoding proteins that 
are important for amino acid synthesis. The GCN4 protein is produced by a 
specific activation of the translation of its mRNA following amino acid starvation that 
is induced when eIF-2 becomes phosphorylated. By a complex mechanism 
depending on competition between correct and incorrect ("decoy") sites of 
initiation of translation near the 5' end of the GCN4 mRNA, the reduction of eIF-2 
activity actually leads to an increase in the synthesis of the GCN4 protein. 

Regulation of the level of eIF-2 is also important in mammalian cells as part 
of the mechanism by which they can be induced to enter a nonproliferating, 
resting state (called G0) in which the rate of protein synthesis is reduced to about 
one-fifth the rate in proliferating cells (discussed in Chapter 17). 



Proteins That Bind to the 5' Leader Region of mRNAs 
Mediate Negative Translational Control 63 

The translation of some mRNA molecules is blocked by specific translation 
repressor proteins that bind near the 5' end of the mRNAs, where translation would 
otherwise begin. This type of mechanism is called negative translational control 
(Figure 9-83). It was first discovered in bacteria, where it enables excess 
ribosomal proteins to repress the translation of their own mRNAs, a form of negative 
feedback regulation. 



Figure 9-83 Negative translational control. This form of control is 
mediated by a sequence-specific RNA-binding protein that acts as a trans- 
lation repressor. Binding of the protein to an mRNA molecule decreases the 
translation of the mRNA. Several cases of this type of translational control 
are known; the illustration is modeled on the mechanism that causes more 
ferritin (an iron storage protein) to be synthesized when the free iron 
concentration in the cytosol rises; the iron-sensitive translation repressor 
protein is called aconitase (see also Figure 9-86). 



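[Figure 9-83 diagram: ON: with no repressor bound, the mRNA is translated from the AUG to the STOP codon and the protein (H2N to COOH) is made. OFF: with the translation repressor protein bound near the 5' end of the mRNA, no protein is made.]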






[Figure 9-84 diagram: three mRNAs, each drawn with its 5' cap, coding sequence, and 3' UTR, and the half-life of each.]

mRNA                   HALF-LIFE 
β-globin mRNA          > 10 hours 
growth factor mRNA     30 minutes 
histone mRNA           1 hour when cell is synthesizing DNA, but 
                       12 minutes when cell is not synthesizing DNA 



In eucaryotic cells a particularly well-studied form of negative translational 
control allows the synthesis of the intracellular iron storage protein ferritin to be 
increased rapidly if the level of soluble iron atoms in the cytosol rises. The iron 
regulation depends on a sequence of about 30 nucleotides in the 5' leader of the 
ferritin mRNA molecule. This iron-response element folds into a stem-loop 
structure that binds a translation repressor protein called aconitase, which blocks the 
translation of any RNA sequence downstream (see Figure 9-83). Aconitase is an 
iron-binding protein, and exposure of the cell to iron causes it to dissociate from 
the ferritin mRNA, releasing the block to translation and increasing the 
production of ferritin by as much as 100-fold. 

Gene Expression Can Be Controlled by a Change 
in mRNA Stability 64 

Most mRNAs in a bacterial cell are very unstable, having a half-life of about 3 
minutes. Because bacterial mRNAs are both rapidly synthesized and rapidly 
degraded, a bacterium can adapt quickly to environmental changes. 

The mRNAs in eucaryotic cells are more stable. Some, such as that encoding 
β-globin, have a half-life of more than 10 hours. Others, however, have a half- 
life of 30 minutes or less. The unstable mRNAs often code for regulatory proteins, 
such as growth factors and gene regulatory proteins, whose production levels 
change rapidly in cells (Figure 9-84). Many of these RNAs are unstable because 
they contain specific sequences that stimulate their degradation. A long sequence 
rich in A and U nucleotides in the 3' untranslated region (UTR) of several mRNAs, 
for example, can, if transferred to other stable mRNAs by recombinant DNA tech- 
niques, cause them to be unstable. This AU-rich sequence appears to accelerate 
mRNA degradation by stimulating the removal of the poly-A tail found at the 
3' end of almost all eucaryotic mRNAs. Other unstable mRNAs contain recogni- 
tion sites in their 3' UTR for specific endonucleases that cleave the mRNA (Fig- 
ure 9-85). 
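
The practical meaning of these half-lives can be seen by computing how much of an mRNA population survives a given time once its synthesis stops. The sketch assumes simple first-order (exponential) decay, which the text does not state explicitly; the half-lives are the approximate figures quoted above, and the function name is hypothetical.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA population left after elapsed_min minutes,
    assuming exponential decay and no new synthesis."""
    return 0.5 ** (elapsed_min / half_life_min)


# Approximate half-lives quoted in the text, in minutes.
half_lives = {
    "typical bacterial mRNA": 3,
    "growth factor mRNA": 30,
    "beta-globin mRNA": 600,
}

for name, half_life in half_lives.items():
    left = fraction_remaining(60, half_life)
    print(f"{name}: {left:.2e} of the molecules remain after 1 hour")
```

One hour after transcription is shut off, a short-lived mRNA has essentially disappeared while a long-lived one is almost undiminished, which is why unstable mRNAs suit proteins whose levels must change quickly.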



Figure 9-84 RNA stability. Three 
normal mRNAs with very different 
half-lives. The continuous rapid 
degradation of the mRNA molecules 
that encode various growth factors 
allows their concentration to be 
changed rapidly in response to 
extracellular signals. A signal for 
degradation in the 3' untranslated 
region (UTR) determines the half-life 
of the growth factor RNA (see Figure 
9-85). Histones are needed mainly to 
form the new chromatin produced 
during DNA synthesis; a large change 
in the stability of their mRNAs helps 
to confine histone synthesis to the S 
phase of the cell cycle. 



[Figure 9-85 diagram: two mRNAs, each with a coding sequence, a 3' UTR, and a poly-A tail. In the first, an evolutionarily conserved 50-nucleotide AU-rich sequence in the 3' UTR promotes the removal of the poly-A tail and causes the mRNA to become unstable, leading to degradation. In the second, a repeated sequence in the 3' UTR promotes cleavage of the 3' UTR by a specific endonuclease; the fragments are rapidly degraded.]



Figure 9-85 Control of RNA 
degradation. Special sequences in the 
3' untranslated region (UTR) of 
unstable mRNAs are responsible for 
their unusually rapid degradation. As 
indicated, AU-rich sequences found 
in the 3' UTR of many short-lived 
mRNAs cause a rapid removal of the 
poly-A tail, which in turn makes the 
RNA unstable. Other mRNAs contain 
sequences in their 3' UTR that serve 
as sites for specific endonucleolytic 
cleavage. 






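[Figure 9-86 diagram: (A) ferritin mRNA. In iron starvation, cytosolic aconitase is bound near the 5' end and translation is blocked; with excess iron (Fe), aconitase is released, the mRNA is translated, and ferritin is made. (B) transferrin receptor mRNA. In iron starvation, cytosolic aconitase is bound to the 3' UTR, the mRNA is stable and translated, and transferrin receptor is made; with excess iron, aconitase is released and the mRNA is degraded.]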



The stability of an mRNA can be changed in response to extracellular signals. 
Steroid hormones, for example, affect a cell not only by increasing the 
transcription of specific genes, but also by increasing the stability of several of the mRNAs 
encoded by these genes. Conversely, the addition of iron to cells decreases the 
stability of the mRNA that encodes the receptor protein that binds the 
iron-transporting protein transferrin, causing less of this receptor to be made. Interestingly, 
the stability of the transferrin receptor mRNA seems to be modulated by the 
iron-sensitive RNA-binding protein aconitase, which, as we discussed above, also 
controls ferritin mRNA translation. Here aconitase binds to the 3' UTR of the 
transferrin receptor mRNA and causes an increase in receptor production, 
presumably by inhibiting the function of sequences that otherwise cause rapid 
degradation of the mRNA. On addition of iron, aconitase is released from the 
mRNA, decreasing mRNA stability (Figure 9-86). 
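
The two opposite responses to iron follow from one binding event, which can be written out as a small truth table. This is a qualitative sketch only: it treats iron as simply abundant or scarce and aconitase binding as all-or-none, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IronResponse:
    aconitase_bound_to_mrna: bool
    ferritin_translated: bool
    transferrin_receptor_mrna_stable: bool


def iron_response(iron_is_abundant: bool) -> IronResponse:
    # Aconitase leaves both mRNAs when it binds iron.
    bound = not iron_is_abundant
    return IronResponse(
        aconitase_bound_to_mrna=bound,
        # Bound to the 5' leader of ferritin mRNA, aconitase blocks translation.
        ferritin_translated=not bound,
        # Bound to the 3' UTR of the receptor mRNA, aconitase prevents degradation.
        transferrin_receptor_mrna_stable=bound,
    )


for abundant in (False, True):
    print("iron abundant:" if abundant else "iron scarce:  ", iron_response(abundant))
```

The same protein therefore raises ferritin synthesis and lowers transferrin receptor synthesis when iron is plentiful, because where it binds determines what its release does.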



Selective mRNA Degradation Is Coupled to Translation 65 
The control of mRNA stability in eucaryotic cells is best understood for the 
mRNAs that encode histones. These mRNAs have a half-life of about 1 hour 
during the DNA synthesis (S) phase of the cell cycle, when new histones are needed, 
but become unstable and are degraded within minutes when DNA synthesis 
stops. If DNA synthesis during S phase is inhibited with a drug, histone mRNAs 
immediately become unstable, perhaps because the accumulation of free 
histones in the absence of new DNA for them to bind increases the degradation rate. 

The regulation of histone mRNA stability depends on a short 3' stem-and-loop 
structure that takes the place of the poly-A tail present at the 3' end of other 
mRNAs. A special cleavage reaction, which requires base-pairing to a 
small RNA in a ribonucleoprotein particle, creates this 3' end after the histone 
mRNA is synthesized by RNA polymerase II. If the 3' end is transferred to other 
mRNAs by recombinant DNA methods, they also become unstable when DNA 
synthesis stops. Thus, as for other types of mRNAs, the degradation rate of 
histone mRNA is strongly influenced by signals near the 3' end, where mRNA 
degradation is thought to begin. 



Figure 9-86 Two posttranscriptional 
controls mediated by iron. In 

response to an increase in iron 
concentration in the cytosol, a cell 
increases its synthesis of ferritin in 
order to bind the extra iron (A) and 
decreases its synthesis of transferrin 
receptors in order to import less iron 
(B). Both responses are mediated by 
the same iron-responsive regulatory 
protein, aconitase, which recognizes 
common features in a stem-and-loop 
structure in the mRNAs encoding 
ferritin and transferrin receptor. 
Aconitase dissociates from the mRNA 
when it binds iron. Because the 
transferrin receptor and ferritin are 
regulated by different types of 
mechanisms, their levels respond 
oppositely to iron concentrations 
even though they are regulated by the 
same iron-responsive regulatory 
protein. (Adapted from M.W. Hentze 
et al., Science 238:1570-1573, 1987; 
J.L. Casey et al., Science 240:924-928, 
1988.) 






